Preface



This volume contains the papers presented at the 27th International Colloquium
on Automata, Languages and Programming (ICALP 2000), which took place at
the University of Geneva, Switzerland, July 9-15, 2000.

The volume contains 69 contributed papers, selected by the two program 
committees from 196 extended abstracts submitted in response to the call for 
papers: 42 from 131 submissions for track A (Algorithms, Automata, Complexity, 
and Games) and 27 from 65 submissions for track B (Logic, Semantics, and 
Theory of Programming). Moreover, the volume includes abstracts of a plenary 
lecture by Richard Karp and of invited lectures by Samson Abramsky, Andrei
Broder, Gregor Engels, Oded Goldreich, Roberto Gorrieri, Johan Håstad, Zohar
Manna, and Kurt Mehlhorn.

The program committees decided to split the EATCS best paper awards
among the following three contributions: Deterministic algorithms for k-SAT
based on covering codes and local search, by Evgeny Dantsin, Andreas Goerdt,
Edward A. Hirsch, and Uwe Schöning; Reasoning about idealized Algol using
regular languages, by Dan R. Ghica and Guy McCusker; and An optimal minimum
spanning tree algorithm, by Seth Pettie and Vijaya Ramachandran.

The best student paper award for track A was given to Clique is hard to
approximate within $n^{1-o(1)}$, by Lars Engebretsen and Jonas Holmerin, and for
track B to On deciding if deterministic Rabin language is in Büchi class, by
Tomasz Fryderyk Urbanski.

We thank all of the authors who submitted papers, our invited speakers, the 
external referees we consulted, and the members of the program committees, 
who were: 



Track A 

• Peter Bro Miltersen, U. Aarhus 

• Harry Buhrman, CWI Amsterdam 

• Martin Dietzfelbinger, TU Ilmenau

• Afonso Ferreira, INRIA Sophia Ant.

• Marcos Kiwi, U. Chile

• Jens Lagergren, KTH Stockholm

• Gheorghe Paun, Romanian Acad.

• Günter Rote, FU Berlin

• Ronitt Rubinfeld, NECI

• Amin Shokrollahi, Bell Labs

• Luca Trevisan, Columbia U.

• Serge Vaudenay, EPF Lausanne

• Emo Welzl, Chair, ETH Zurich

• Uri Zwick, Tel Aviv U.

Track B

• Rance Cleaveland, Stony Brook

• Pierpaolo Degano, U. Pisa

• Jose Fiadeiro, U. Lisbon

• Andy Gordon, Microsoft Cambridge

• Orna Grumberg, Technion Haifa

• Claude Kirchner, INRIA Nancy

• Ugo Montanari, Chair, U. Pisa

• Joachim Parrow, KTH Stockholm

We gratefully acknowledge support from the Swiss National Science Foundation,
from the computer science department of the University of Geneva, from the
European agency INTAS, and from the EATCS. Finally, we would like to thank
the local arrangement committee members (Olivier Powell, Frédéric Schütz,
Danuta Sosnowska, and Thierry Zwissig) and Germaine Gusthiot for the
secretarial support.



Game Semantics: Achievements and Prospects



Samson Abramsky 

Department of Computer Science 
University of Edinburgh 
http://www.dcs.ed.ac.uk/home/samson/



Abstract. Game-theoretic ideas have been used to model a wide range
of computational notions over the past few years. This has led to some
striking results of a foundational character, relating in particular to
definability and full abstraction. The first applications of these ideas, to
program analysis and verification, have also begun to appear. We shall
give an overview of what has been achieved, and try to map out some
objectives for future research.



U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, p. 1, 2000. 
© Springer-Verlag Berlin Heidelberg 2000




Clique Is Hard to Approximate within $n^{1-o(1)}$



Lars Engebretsen and Jonas Holmerin 

Royal Institute of Technology 
Department of Numerical Analysis and Computing Science 
SE-100 44 Stockholm, SWEDEN 
Fax: +46 8 790 09 30 
{enge,joho}@nada.kth.se



Abstract. It was previously known that Max Clique cannot be approximated
in polynomial time within $n^{1-\epsilon}$ for any constant $\epsilon > 0$, unless
NP = ZPP. In this paper, we extend the reductions used to prove
this result and combine the extended reductions with a recent result of
Samorodnitsky and Trevisan to show that clique cannot be approximated
within $n^{1-O(1/\sqrt{\log\log n})}$ unless
$\mathrm{NP} \subseteq \mathrm{ZPTIME}(2^{O(\log n(\log\log n)^{3/2})})$.



1 Introduction 

The Max Clique problem, i.e., the problem of finding in a graph $G = (V,E)$ the
largest possible subset $C$ of the vertices in $V$ such that every vertex in $C$
has edges to all other vertices in $C$, is a well-known combinatorial
optimization problem. The decision version of Max Clique was one of the problems
proven to be NP-complete in Karp's original paper on NP-completeness, which
means that we cannot hope to solve Max Clique efficiently, at least not if we
want an exact solution. Thus, attention has turned to algorithms producing
solutions which are at most some factor from the optimum value. It is trivial to
approximate Max Clique in a graph with $n$ vertices within $n$ (just pick any
vertex as the clique), and Boppana and Halldórsson have shown that Max Clique
can be approximated within $O(n/\log^2 n)$ in polynomial time. It is an
astonishing, and unfortunate, result that it is hard to do substantially better
than this. In fact, the Max Clique problem cannot be approximated within
$n^{1-\epsilon}$ for any constant $\epsilon > 0$, unless NP = ZPP. The first to
explore the possibility of proving strong lower bounds on the approximability of
Max Clique were Feige et al., who proved a connection between Max Clique and
probabilistic proof systems. Their reduction was then improved independently by
Bellare, Goldreich, and Sudan and by Zuckerman. As the final link in the chain,
Håstad constructed a probabilistic proof system with the properties needed to
get a lower bound of $n^{1-\epsilon}$.

Since the hardness result holds for any arbitrarily small constant $\epsilon$,
the next logical step to improve the lower bound is to show inapproximability
results for non-constant $\epsilon$. However, Håstad's proof of the existence of
a probabilistic proof system with the needed properties is very long and
complicated. This has, until now, hindered any advance in this direction, but
recently, Samorodnitsky and Trevisan constructed another probabilistic proof
system with the needed properties, but where the proof of correctness is much
simpler. Armed with this new construction, new results are within reach.

In this paper, we show that it is indeed impossible to approximate Max Clique
in polynomial time within $n^{1-\epsilon}$ where
$\epsilon \in O(1/\sqrt{\log\log n})$, given that NP does not admit randomized
algorithms with slightly super-polynomial expected running time. To do this we
first ascertain that the reductions from probabilistic proof systems to Max
Clique work also in the case of a non-constant $\epsilon$.

This has the additional bonus of collecting in one place the various parts of the
reduction, which were previously scattered in the literature. We also extend the
previously published reductions to be able to use the construction of
Samorodnitsky and Trevisan, which characterizes NP in terms of a probabilistic
proof system with so-called non-perfect completeness. To our knowledge, such
reductions have not appeared explicitly in the literature before.

When we combine the new reductions with the probabilistic proof system
of Samorodnitsky and Trevisan, we obtain the following concrete result
regarding the approximability of Max Clique:

Theorem 1. Unless $\mathrm{NP} \subseteq \mathrm{ZPTIME}(2^{O(\log n(\log\log n)^{3/2})})$,
Max Clique on a graph with $n$ vertices cannot be approximated within
$n^{1-O(1/\sqrt{\log\log n})}$ in polynomial time.

As a comparison, the best known polynomial time approximation algorithm
approximates Max Clique within $n^{1-O(\log\log n/\log n)}$. We omit several
proofs from this extended abstract. They are contained in the full version of
the paper, available from the authors' home pages.

2 Preliminaries 

Definition 1. Let $P$ be an NP maximization problem. For an instance $x$ of $P$
let $\mathrm{opt}(x)$ be the optimal value. A solution $y$ with weight $w(x,y)$
is $c$-approximate if it is feasible and $w(x,y) \ge \mathrm{opt}(x)/c$.



Definition 2. A c-approximation algorithm for an NP optimization problem P 
is a polynomial time algorithm that for any instance x and any input y outputs 
a c-approximate solution. 

We use the wording to approximate within c as a synonym for to compute a 
c-approximate solution. 

Definition 3. Max Clique is the following maximization problem: Given a graph
$G = (V,E)$ find the largest possible $C \subseteq V$ such that if $v_1$ and
$v_2$ are vertices in $C$, then $(v_1, v_2)$ is an edge in $E$.

Definition 4. G-gap E3-Sat-5 is the following decision problem: We are given
a Boolean formula $\phi$ in conjunctive normal form, where each clause contains
exactly three literals and each literal occurs exactly five times. We know that
either $\phi$ is satisfiable or at most a fraction G of the clauses in $\phi$
are satisfiable and are supposed to decide if the formula is satisfiable.

We know from [7] that G-gap E3-Sat-5 is NP-hard.
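Definitions 1 and 3 can be made concrete on a toy instance. The following is a minimal sketch, assuming a small hand-made graph (the edge set `EDGES` and all function names are illustrative and not taken from the paper); the optimum is found by exhaustive search, which is only feasible for very small graphs.

```python
from itertools import combinations

def is_clique(edges, vertices):
    """True if every pair of the given vertices is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))

def max_clique(n, edges):
    """Brute-force optimum over vertices 0..n-1 (exponential time)."""
    for size in range(n, 0, -1):
        for candidate in combinations(range(n), size):
            if is_clique(edges, candidate):
                return set(candidate)
    return set()

def is_c_approximate(n, edges, solution, c):
    """Definition 1 for Max Clique: feasible and |solution| >= opt / c."""
    opt = len(max_clique(n, edges))
    return is_clique(edges, solution) and len(solution) >= opt / c

# Hypothetical toy graph: a triangle 0-1-2 plus a pendant edge 2-3.
EDGES = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (2, 3)]}
```

On this graph any single vertex is a feasible solution, so it is trivially $n$-approximate, but it is not 2-approximate because the optimum has three vertices.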

2.1 Previous Hardness Results 

A language $L$ is in the class NP if there exists a polynomial time Turing
machine $M$ with the following properties:

- For instances $x \in L$, there exists a proof $\pi$, of size polynomial in
$|x|$, such that $M$ accepts $(x, \pi)$.

- For instances $x \notin L$, $M$ does not accept $(x, \pi)$ for any proof $\pi$
of size polynomial in $|x|$.

Arora and Safra [2] used a generalization of the above definition of NP to
define the class $\mathrm{PCP}[r,q]$, consisting of a probabilistically checkable
proof system (PCP) where the verifier has oracle access to the membership proof,
is allowed to use $r(n)$ random bits and query $q(n)$ bits from the oracle.

Definition 5. A probabilistic polynomial time Turing machine $V$ with oracle
access to $\pi$ is an $(r,q)$-restricted verifier if it, for every oracle $\pi$
and every input of size $n$, uses at most $r(n)$ random bits and queries at most
$q(n)$ bits from the oracle. We denote by $V^\pi$ the verifier $V$ with the
oracle $\pi$ fixed.

Definition 6. A language $L$ belongs to the class $\mathrm{PCP}[r,q]$ if there
exists an $(r,q)$-restricted verifier $V$ with the following properties:

- For instances $x \in L$, $\Pr_\rho[V^\pi \text{ accepts } (x,\rho)] = 1$ for some oracle $\pi$.

- For instances $x \notin L$, $\Pr_\rho[V^\pi \text{ accepts } (x,\rho)] \le 1/2$ for all oracles $\pi$.

Above, $\rho$ is the random string of length $r$.

The connection between the approximability of Max Clique and PCPs was first
explored by Feige et al., who showed that

    $\mathrm{NP} \subseteq \mathrm{PCP}[O(\log n\log\log n), O(\log n\log\log n)]$    (1)

and used this characterization of NP and a reduction to show that Max Clique
cannot be approximated within any constant unless

    $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log\log n)})$.    (2)

The assumption on NP needed to prove hardness results on the approximability
of Max Clique is closely related to the connection between the classes NP
and $\mathrm{PCP}[r,q]$ for various values of $r$ and $q$. This connection was
the subject of intensive investigations leading to the following result of
Arora et al.:

Theorem 2. $\mathrm{NP} = \mathrm{PCP}[O(\log n), O(1)]$.

A consequence of this result is that the abovementioned assumptions in the proof
of Feige et al. could be weakened to $\mathrm{P} \ne \mathrm{NP}$.

A technical tool in the proof of Feige et al. is the construction of a
graph $G_{V,x}$ corresponding to a verifier in some proof system and some
input $x$.

Definition 7. From a verifier $V$ and some input $x$, we construct a graph
$G_{V,x}$ as follows: Every vertex in $G_{V,x}$ corresponds to an accepting
computation of the verifier. Two vertices in $G_{V,x}$ are connected if they
correspond to consistent computations. Two computations $\Pi_1$ and $\Pi_2$ are
consistent if, whenever some bit $b$ is queried from the oracle, the answers are
the same for both $\Pi_1$ and $\Pi_2$.

In the original construction, the number of vertices in $G_{V,x}$ was bounded by
$2^{r(n)+q(n)}$, where $r(n)$ is the number of random bits used by the verifier
and $q(n)$ is the number of bits the verifier queries from the oracle. Feige et
al. suggest in their paper that the bound on the number of vertices in $G_{V,x}$
could be improved, and it was later recognized that the number of vertices can
be bounded by $2^{r(n)+f(n)}$, where $f(n)$ is the free bit complexity.

Definition 8. A verifier has free bit complexity $f$ if the number of accepting
computations is at most $2^f$ for any outcome of the random bits tossed by the
verifier.

Definition 9. A language $L$ belongs to the class $\mathrm{FPCP}_{c,s}[r,f]$ if
there exists a verifier $V$ with free bit complexity $f$ that given an input $x$
and oracle access to $\pi$ tosses $r$ independent random bits $\rho$ and has the
following properties:

- for instances $x \in L$, $\Pr_\rho[V^\pi \text{ accepts } (x,\rho)] \ge c$ for some oracle $\pi$.

- for instances $x \notin L$, $\Pr_\rho[V^\pi \text{ accepts } (x,\rho)] \le s$ for all oracles $\pi$.

We say that $V$ has completeness $c$ and soundness $s$.

To understand the intuition behind the free bit complexity of a proof system, it is 
perhaps best to study the behavior of a typical verifier in a typical proof system. 
Such a verifier first reads a number of bits, the free bits, from the oracle. From 
the information obtained from those bits and the random string, the verifier 
determines a number of bits, the non-free bits, that it should read next from the 
oracle and the values these bits should have in order for the verifier to accept. 
Finally, the verifier reads these bits from the oracle and checks if they have
the expected values.

Theorem 3. Suppose that $L \in \mathrm{FPCP}_{c,s}[r,f]$. Let $x$ be some
instance of $L$, and construct the graph $G_{V,x}$ as in Definition 7. Then,
there is a clique of size at least $c2^r$ in $G_{V,x}$ if $x \in L$, and there
is no clique of size greater than $s2^r$ if $x \notin L$.

Proof. First suppose that $x \in L$. Then there exists an oracle such that a
fraction $c$ of all random strings make the verifier accept. The computations
corresponding to the same oracle are always consistent, and thus there exists a
clique of size at least $c2^r$ in $G_{V,x}$.

Now suppose that $x \notin L$ and that there is a clique of size greater than
$s2^r$ in $G_{V,x}$. Since vertices corresponding to the same random string can
never represent consistent computations, the vertices in the clique all
correspond to different random strings. Thus, we can use the vertices to form an
oracle making the verifier accept with probability larger than $s$. This
contradicts the assumption that the PCP has soundness $s$.
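The correspondence between cliques and oracles used in this proof can be checked by brute force on a toy example. The sketch below assumes a hypothetical non-adaptive verifier (the table `CHECKS` is invented for illustration and is not from the paper): for each of four random strings it reads two oracle bits and accepts iff their XOR equals a target bit. It builds the graph of Definition 7 and compares the largest clique with the largest number of random strings that any single oracle makes accept.

```python
from itertools import combinations, product

# Hypothetical toy verifier: rho -> ((queried positions), required XOR).
CHECKS = {0: ((0, 1), 1), 1: ((1, 2), 1), 2: ((0, 2), 1), 3: ((0, 1), 0)}
ORACLE_BITS = 3

def accepting_computations():
    """Vertices of G_{V,x}: pairs (rho, answers) on which the verifier accepts."""
    vertices = []
    for rho, (positions, target) in CHECKS.items():
        for answers in product((0, 1), repeat=2):
            if (answers[0] ^ answers[1]) == target:
                vertices.append((rho, dict(zip(positions, answers))))
    return vertices

def consistent(u, v):
    """Edge rule: different random strings and equal answers on shared positions."""
    if u[0] == v[0]:
        return False
    return all(v[1][pos] == bit for pos, bit in u[1].items() if pos in v[1])

def max_clique_size(vertices):
    """Largest set of pairwise consistent computations (exhaustive search)."""
    best = 0
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if all(consistent(a, b) for a, b in combinations(candidate, 2)):
                best = size
                break
    return best

def best_oracle_acceptances():
    """Maximum, over all oracles, of the number of accepting random strings."""
    return max(
        sum((pi[a] ^ pi[b]) == t for (a, b), t in CHECKS.values())
        for pi in product((0, 1), repeat=ORACLE_BITS)
    )
```

In this toy verifier $r = 2$ and each random string has two accepting computations, so the graph has $2^{r+f} = 8$ vertices with $f = 1$. The two quantities coincide, which is the content of both directions of the proof.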



Corollary 1. Suppose that $\mathrm{NP} \subseteq \mathrm{FPCP}_{c,s}[O(\log n), f]$
for some constants $c$, $s$, and $f$. Then it is impossible to approximate Max
Clique within $c/s$ in polynomial time unless P = NP.

Proof. Let $L$ be some NP-complete language and $x$ be some instance of $L$. Let
$B$ be some polynomial time algorithm approximating Max Clique within $c/s$.

The following algorithm decides $L$: Construct the graph $G_{V,x}$ corresponding
to the instance $x$. Now run $B$ on $G_{V,x}$. If $B$ determines that $G_{V,x}$
has a clique containing more than $s2^r$ vertices, where $r \in O(\log n)$ is
the number of random bits used by the verifier, accept $x$, otherwise reject.

Since the number of random bits used by the verifier is logarithmic and the
number of free bits is a constant, the graph $G_{V,x}$ has polynomial size.
Since $B$ is a polynomial time algorithm, the above algorithm also runs in
polynomial time.

It is possible to improve on the above result by gap amplification. The simplest
form of gap amplification is to perform a constant number of independent
runs of the verifier. If any of the rounds causes the verifier to reject, we
reject, otherwise we accept. This shows that, for any constant $k$,

    $\mathrm{FPCP}_{c,s}[r,f] \subseteq \mathrm{FPCP}_{c^k,s^k}[kr,kf]$,    (3)

for any functions $c$, $s$, $r$, and $f$, which strengthens Corollary 1 to

Corollary 2. Suppose that $\mathrm{NP} \subseteq \mathrm{FPCP}_{c,s}[O(\log n), f]$
for some constants $c$, $s$, and $f$. Then it is impossible to approximate Max
Clique within any constant in polynomial time unless P = NP.

The above procedure can improve the inapproximability result from a specific
constant $c/s$ to any constant, but to improve the inapproximability result from
$n^{\alpha}$ to $n^{\alpha'}$ for some constants $\alpha$ and $\alpha'$, we have
to use a more sophisticated form of gap amplification. Also, the concept of free
bit complexity needs to be refined. To see why the above procedure fails in this
case, suppose that we have some proof system which gives a graph $G_{V,x}$ with
$n = 2^{r+f}$ vertices such that we can deduce that it is impossible to
approximate Max Clique within $n^{\alpha}$ in polynomial time. Put another way,
this particular proof system has $c/s = n^{\alpha}$. Now we try to apply the
above gap amplification technique. Then we get a new graph $G_{V',x}$ with
$N = 2^{kr+kf} = n^k$ vertices and a new inapproximability factor
$c^k/s^k = n^{k\alpha} = N^{\alpha}$. Thus, we have failed to improve the lower
bound. Obviously, it is not only the free bit complexity of a proof system that
is important when it comes to proving lower bounds for Max Clique, but also the
gap, the quotient of the soundness and the completeness. We see above that an
exponential increase in the gap does not give us anything if the free bit
complexity and the number of random bits increase linearly. Bellare and Sudan
recognized that the interesting parameter is $f/\log s^{-1}$ in the case of
perfect completeness. This parameter was later named the amortized free bit
complexity and denoted by $\bar f$. Note that the above gap amplification does
not change $\bar f$. Two methods which do improve the lower bound in the case
above by keeping down the number of random bits needed to amplify the gap have
appeared in the literature, and both prove the same result: If every language in
NP can be decided by a proof system with logarithmic randomness, perfect
completeness, and amortized free bit complexity $\bar f$, then Max Clique cannot
be approximated within $n^{1/(1+\bar f)-\epsilon}$ in polynomial time, unless
NP = ZPP. The constructions are valid for any constant $\bar f$ and some
arbitrarily small constant $\epsilon > 0$, and they use the same principle as
the above gap amplification: They perform consecutive, although not independent,
runs of the verifier and accept if all runs accept.
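The arithmetic behind this failure is easy to check numerically. The sketch below is illustrative only (function names and the sample parameters are assumptions, logarithms are base 2): it applies $k$ independent runs as in Eq. (3) and computes the exponent $\alpha$ for which the gap $c/s$ equals the number of vertices raised to $\alpha$.

```python
import math

def amplify(c, s, r, f, k):
    """k independent runs, accepting iff all runs accept, as in Eq. (3)."""
    return c ** k, s ** k, k * r, k * f

def gap_exponent(c, s, r, f):
    """alpha such that c/s = N**alpha, where N = 2**(r + f) bounds the graph size."""
    return math.log2(c / s) / (r + f)

def amortized_perfect(s, f):
    """f / log(1/s), the parameter of interest for perfect completeness."""
    return f / math.log2(1 / s)
```

For instance, with $c = 1$, $s = 1/2$, $r = 10$, $f = 2$ the exponent is $1/12$ both before and after five independent runs, and $f/\log s^{-1}$ stays at 2.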



2.2 A New Amortized Free Bit Complexity 

For the case of non-perfect completeness, Bellare et al. define the amortized
free bit complexity as $f/\log(c/s)$. In this paper, we propose that this
definition should be modified.



Definition 10. The amortized free bit complexity for a PCP with free bit
complexity $f$, completeness $c$ and soundness $s$ is

    $\bar f = \frac{f + \log c^{-1}}{\log(c/s)}$.    (4)
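As a quick numeric check of Definition 10, the following sketch compares it with the earlier definition $f/\log(c/s)$ under one modification of the verifier: guessing a free bit, which lowers $f$ by one and halves both $c$ and $s$. The function names and sample values are illustrative assumptions; logarithms are taken base 2.

```python
import math

def amortized_proposed(f, c, s):
    """Definition 10: (f + log(1/c)) / log(c/s)."""
    return (f + math.log2(1 / c)) / math.log2(c / s)

def amortized_bellare(f, c, s):
    """Earlier definition for non-perfect completeness: f / log(c/s)."""
    return f / math.log2(c / s)

def guess_free_bit(f, c, s):
    """Guess one free bit: f drops by one, completeness and soundness halve."""
    return f - 1, c / 2, s / 2
```

With $f = 4$, $c = 0.9$, $s = 0.01$, the value from Definition 10 is unchanged by the guess, since $\log((c/2)^{-1}) = 1 + \log c^{-1}$ exactly offsets the lost free bit, while $f/\log(c/s)$ decreases.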



Note that both this definition and the previous one reduce to $f/\log s^{-1}$ in
the case of perfect completeness, i.e., when $c = 1$. Note also that the above
gap amplification does not change the amortized free bit complexity, neither
with the original definition nor with our proposed modification of the
definition. However, our proposed definition is robust also with respect to the
following: Suppose that we modify the verifier in such a way that it guesses the
value of the first free bit. This lowers the free bit complexity by one, and
halves the completeness and the soundness of the test. With our proposed
definition, the amortized free bit complexity does not change, while it
decreases with the definition of Bellare et al. In the case of perfect
completeness, the lower bound on the approximability increases as the amortized
free bit complexity decreases. This makes it dubious to have a definition in the
general case that allows the amortized free bit complexity to be lowered by a
process such as the above. Using our proposed definition of the amortized free
bit complexity, we first establish that the construction of Zuckerman works also
in the case of non-constant parameters:

Theorem 4. If $\mathrm{NP} \subseteq \mathrm{FPCP}_{1,s}[r,f]$, then, for any
$r \in \Omega(\log n)$ and any $R > r$, it is impossible to approximate Max
Clique in a graph with $N$ vertices within $N^{1/(1+\bar f)-r/R}$ in polynomial
time unless

    $\mathrm{NP} \subseteq \mathrm{coRTIME}(2^{\Theta(R+\bar f+R\bar f)})$.    (5)

In the case where $\bar f$ is some constant and $r \in O(\log n)$, this reduces
to the well known theorem that Max Clique cannot be approximated within
$N^{1/(1+\bar f)-\epsilon}$ for any constant $\epsilon > 0$, unless NP = ZPP. To
see this, just choose $R = r/\epsilon$ above. We also investigate the case of
non-perfect completeness. By using the same approach as above (performing
consecutive, although not independent, runs of the verifier and accepting if all
runs accept) we obtain the following theorem, which is implicit in the works of
Bellare et al. and Zuckerman.

Theorem 5. If $\mathrm{NP} \subseteq \mathrm{FPCP}_{c,s}[r,f]$, then, for any
$r \in \Omega(\log n)$, and any $R > r$ such that $c^D 2^R/2 > 2^r$, where
$D = (R+2)/\log s^{-1}$, Max Clique in a graph with $N$ vertices cannot be
approximated within [...] in polynomial time unless

    $\mathrm{NP} \subseteq \mathrm{BPTIME}(2^{\Theta(\cdots)})$.    (6)

Note that in our applications we choose $R$ such that the term $(r+1)/R$ in the
above theorem is small.

When amplifying the gap of a PCP with non-perfect completeness, it seems
more natural to use an accept condition different from the above: Instead of
accepting when all runs of the verifier accept, accept when some fraction $\nu$
of the runs accept. We investigate the consequences of this new condition and
show that using that condition we can construct a reduction without two-sided
error also in the case of non-perfect completeness. The parameter of interest
turns out to be

    $F_\nu = \frac{f + (1-\nu)\log(q-f+1) + 1}{H(\nu,s)}$,    (7)

where $q$ is the number of query bits in the verifier, $\nu$ is a parameter
which is arbitrarily close to $c$, and

    $H(\nu,s) = -\nu\log\frac{s}{\nu} - (1-\nu)\log\frac{1-s}{1-\nu}$.    (8)

We can then prove the following theorem: 

Theorem 6. Suppose every language in NP can be decided by a PCP with
completeness $c$, soundness $s$, query complexity $q$, and free bit complexity
$f$. Let $\mu$ and $\nu$ be any constants such that $\mu > 0$ and
$s < \nu < c$. Let $h = ((1+\mu)c - \mu - \nu)/(1-\nu)$. Then, for any $R > r$,
it is impossible to approximate Max Clique in a graph with
$N = 2^{R+(R+2)F_\nu}$ vertices within

    $N^{1/(1+F_\nu) - r/R - (\log h^{-1})/R}$    (9)

by algorithms with expected polynomial running time unless

    $\mathrm{NP} \subseteq \mathrm{ZPTIME}(2^{\Theta(R+F_\nu+RF_\nu)})$.    (10)

If $F_\nu$ is a constant and $r(n) \in O(\log n)$, the above theorem says that
Max Clique is hard to approximate within $N^{1/(1+F_\nu)-\epsilon-o(1)}$, for
$\nu$ arbitrarily close to $c$, if we choose $\nu = (1+\mu)c - 2\mu$, $\mu$ small
enough, and $R = r/\epsilon$ in the above theorem.

This might seem worse than in the case with two-sided error, where the
interesting parameter was $\bar f = (f + \log c^{-1})/\log(c/s)$ instead of
$F_\nu$. When $c = 1/2$, $s$ is small and $\nu$ is close to $c$, this is indeed
the case: then $F_\nu$ is about $2\bar f$. However, when $c$ is close to 1, $s$
is small and the PCP has reasonably low query complexity, we expect $\bar f$ and
$F_\nu$ to be close.
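The comparison between the two parameters can be explored numerically. The sketch below is a hedged illustration: it assumes that $H(\nu,s)$ in Eq. (8) is the binary relative entropy between $\nu$ and $s$ (base 2), which is our reading of the formula, and the sample values of $f$, $q$, $c$, $\nu$ and $s$ are invented for the example.

```python
import math

def relative_entropy(nu, s):
    """Assumed reading of H(nu, s): binary relative entropy, base 2."""
    return nu * math.log2(nu / s) + (1 - nu) * math.log2((1 - nu) / (1 - s))

def f_nu(f, q, nu, s):
    """Parameter F_nu of Eq. (7) under the reading of H above."""
    return (f + (1 - nu) * math.log2(q - f + 1) + 1) / relative_entropy(nu, s)

def f_bar(f, c, s):
    """Amortized free bit complexity of Definition 10."""
    return (f + math.log2(1 / c)) / math.log2(c / s)

# Illustrative parameters only: many free bits, tiny soundness.
F, Q, S = 100, 110, 2.0 ** -1000
ratio_half = f_nu(F, Q, 0.49, S) / f_bar(F, 0.5, S)   # c = 1/2, nu close to c
ratio_one = f_nu(F, Q, 0.98, S) / f_bar(F, 0.99, S)   # c close to 1
```

With these sample values the first ratio comes out close to 2 and the second close to 1, in line with the remarks above.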

3 Hardness of Approximating Max Clique 

In their recent paper, Samorodnitsky and Trevisan give a new PCP for NP
with optimal amortized query complexity, $1+\epsilon$ for any constant
$\epsilon > 0$.

Theorem 7 (Implicit in [11]). For any positive integer $k$ and any constants
$\epsilon > 0$ and $\delta > 0$,

    $\mathrm{NP} \subseteq \mathrm{FPCP}_{(1-\epsilon)^{k^2},\,2^{-k^2}+\delta}[O(\log n), 2k]$.    (11)

This result implies that the test has amortized free bit complexity $\epsilon$,
for any constant $\epsilon > 0$. Since the construction is much simpler than the
construction of Håstad, with reasonable effort it is possible to work through
the construction with a non-constant $\epsilon$. This yields the following
theorem (we omit the proof):

Theorem 8. For any increasing function $k(n)$ and any decreasing functions
$\epsilon(n) > 0$ and $\delta(n) > 0$, G-gap E3-Sat-5 has a PCP which has query
complexity $q = k^2+2k$, free bit complexity $f = 2k$, completeness
$c \ge (1-\epsilon)^{k^2}$, soundness $s \le 2^{-k^2}+\delta$, and uses

    $r \le C'_G(\log n + 3k)\log((2\epsilon^{-1})\delta^{-2}) + (2k + k^2\log\epsilon^{-1})((2\epsilon^{-1})\delta^{-2})^{C''_G}$    (12)

random bits, for some constants $C'_G$ and $C''_G$.

When we combine the above theorem with Theorem 6 we obtain the proof of
Theorem 1.

Proof (of Theorem 1). The proof is just a matter of finding suitable choices for
the parameters involved: $q$, $f$, $s$, $c$, $k$, and $R$. By putting
$\epsilon = k^{-2}$ and $\delta = 2^{-k^2}$ in Theorem 8 we get that
$c \ge e^{-1}$, $s \le 2^{1-k^2}$, and

    $r \le C'_G(\log n + 3k)\log(2k^2 2^{2k^2}) + (2k + 2k^2\log k)(2k^2 2^{2k^2})^{C''_G}$.    (13)

If we let

    $k(n) = c_0\sqrt{\log\log n}$,    (14)

where $c_0^2 < 1/(2C''_G)$, we get

    $k^2 < (\log\log n)/(2C''_G)$,    (15)

    $2^{2k^2} < (\log n)^{1/C''_G}$,    (16)

which implies that $r$ is dominated by the first term. If we substitute our
choice of $k$ in this term, we obtain $r = O(\log n\log\log n)$. Now we set
$\nu = (1+\mu)c - 2\mu$, where $\mu$ is some arbitrary constant such that
$s < \nu < c$. Then

    $H(\nu,s) = \nu k^2 + O(1)$,    (17)

    $F_\nu = \frac{2k + O(\log k)}{\nu k^2 + O(1)} = \frac{2}{\nu k} + o(1/k)$.    (18)

If we set $R = r/F_\nu$ in Theorem 6 we get that

    $h = \frac{(1+\mu)c - \mu - \nu}{1-\nu} > \mu$    (19)

and that it is impossible to approximate Max Clique in a graph with
$N = 2^{r/F_\nu + r + 2F_\nu}$ vertices within

    $N^{1/(1+F_\nu) - r/R - (\log h^{-1})/R} \ge N^{1-2F_\nu-o(F_\nu)}$    (20)

in polynomial time, unless

    $\mathrm{NP} \subseteq \mathrm{ZPTIME}(2^{\Theta(r/F_\nu + F_\nu + r)}) = \mathrm{ZPTIME}(2^{\Theta(\log n(\log\log n)^{3/2})})$.    (21)

Now, we want to express this ratio in terms of $N$. If we insert Eq. 18 in
Eq. 20 we get that

    $N^{1-2F_\nu-o(F_\nu)} = N^{1-4/(\nu k)+o(1/k)}$,    (22)

and if we insert Eq. 14 into this we get that

    $N^{1-2F_\nu-o(F_\nu)} = N^{1-4/(\nu c_0\sqrt{\log\log n})+o(1/\sqrt{\log\log n})}$.    (23)

Since $\sqrt{\log\log N} = \sqrt{\log\log n}\,(1+o(1))$,

    $o\left(\frac{1}{\sqrt{\log\log n}}\right) = o\left(\frac{1}{\sqrt{\log\log N}}\right)$    (24)

and

    $\frac{1}{\sqrt{\log\log N}} = \frac{1}{\sqrt{\log\log n}}\cdot\frac{1}{1+o(1)} = (1-o(1))\frac{1}{\sqrt{\log\log n}} = \frac{1}{\sqrt{\log\log n}} - o\left(\frac{1}{\sqrt{\log\log N}}\right)$.    (25)

Thus,

    $N^{1-2F_\nu-o(F_\nu)} = N^{1-C_1/\sqrt{\log\log N}-o(1/\sqrt{\log\log N})}$,    (26)

where $C_1 = 4/(\nu c_0)$.

Note that we do not gain anything if we use Theorem 5 instead of Theorem 6.
In the former case we get

    $\bar f = \frac{2k + O(1)}{k^2 + O(1)} = \frac{2}{k} + o(1/k)$,    (27)

and to get a reasonable value for $r$, we need to set $k^2 = O(\log\log n)$.
Thus we get the same hardness result, except for the constant, but with a
stronger assumption, $\mathrm{NP} \not\subseteq \mathrm{BPTIME}(\cdot)$ instead
of $\mathrm{NP} \not\subseteq \mathrm{ZPTIME}(\cdot)$, if we use Theorem 5.



4 Future Work 



An obvious way to improve this result would be to weaken the assumptions 
on NP we used in our hardness result. Best of all, of course, would be to construct 
deterministic reductions, since this would allow us to replace the probabilistic 
complexity classes with deterministic ones in all our assumptions on NP. Until 
this is done, an interesting open question is to determine the best definition of 
the amortized free bit complexity. We have proposed that the definition should 
be 



    $\bar f = \frac{f + \log c^{-1}}{\log(c/s)}$.    (28)



This definition works well in the sense that a PCP with one-sided error gives a 
hardness result for Max Clique under the assumption that NP-complete prob- 
lems cannot be decided with one-sided error in probabilistic polynomial time, 
and similarly a PCP with two-sided error gives a hardness result for Max Clique 
under the assumption that NP-complete problems cannot be decided with two- 
sided error in probabilistic polynomial time. 

However, we have seen in Theorem 6 that if one wants to use a PCP with
two-sided error to obtain hardness results under the assumption that NP-complete
problems cannot be decided with one-sided error in probabilistic polynomial
time, the interesting parameter is (close to) $F_c$, defined in Eq. 7. To
establish whether it is possible to improve this to our proposed definition of
$\bar f$, or if $F_c$ is the best possible in this case, is an interesting open
question.

Trying to obtain an upper bound is also interesting, especially since it is
currently unknown how well the Lovász $\vartheta$-function approximates Max
Clique. Feige has shown that it cannot approximate Max Clique within [...]
but, in light of Håstad's results and the results of this paper, this does not
compromise the Lovász $\vartheta$-function. It may very well be that it beats
the combinatorial algorithm of Boppana and Halldórsson.



Abstract. The independence number of a graph and its chromatic number
are hard to approximate. It is known that, unless coRP = NP, there
is no polynomial time algorithm which approximates any of these
quantities within a factor of $n^{1-\epsilon}$ for graphs on $n$ vertices.

We show that the situation is significantly better for the average case.
For every edge probability $p = p(n)$ in the range
$n^{-1/2+\epsilon} \le p \le 3/4$, we present an approximation algorithm for the
independence number of graphs on $n$ vertices, whose approximation ratio is
$O((np)^{1/2}/\log n)$ and whose expected running time over the probability
space $G(n,p)$ is polynomial. An algorithm with similar features is described
also for the chromatic number.

A key ingredient in the analysis of both algorithms is a new large devia- 
tion inequality for eigenvalues of random matrices, obtained through an 
application of Talagrand’s inequality. 



1 Introduction 

An independent set in a graph $G = (V,E)$ is a subset of vertices spanning no
edges. The independence number of $G$, denoted by $\alpha(G)$, is the size of a
largest independent set in $G$. A coloring of $G$ is a partition
$V = C_1 \cup \ldots \cup C_k$ of its vertex set $V$, in which every part (color
class) $C_i$ forms an independent set. The chromatic number $\chi(G)$ of $G$ is
the minimal possible number of colors in a coloring of $G$.

Independence number and chromatic number are essential notions in
combinatorics and the problem of estimating these parameters is central in both
graph theory and theoretical computer science. Unfortunately, it turns out that
both of these problems are notoriously difficult. Computing the exact value of
$\alpha(G)$ or $\chi(G)$ is known to be NP-hard since the seminal paper of Karp.

Given these hardness results, one still hopes to approximate the above
parameters within a reasonable factor in polynomial time. For a number
$f \ge 1$, we say that an algorithm $A$ approximates the independence number
within factor $f$ over graphs on $n$ vertices, if for every such graph $G$, $A$
outputs an independent set $I$ whose size satisfies $|I| \ge \alpha(G)/f$.
Similarly, $A$ approximates the chromatic number within factor $f$ for graphs on
$n$ vertices if for every such graph it outputs a coloring with the number of
colors $k$ satisfying $k \le f\chi(G)$. We refer the reader to a survey book for
a detailed discussion of approximation algorithms.

However, recent results have shown that in the worst case both computational
problems are also hard to approximate. Håstad showed that unless coRP = NP,
there is no approximation algorithm for the independence number whose
approximation ratio over graphs on $n$ vertices is less than $n^{1-\epsilon}$
for any fixed $\epsilon > 0$. Also, Feige and Kilian proved the same hardness
result for the chromatic number. In this paper we aim to show that the situation
is significantly better when one considers the average case and not the worst
case. When discussing the performance of an algorithm $A$ in the average case,
it is usually assumed that a probability distribution on the set of all inputs
of $A$ is defined. The most widely used probability measure on the set of all
graphs on $n$ vertices is the random graph $G(n,p)$. For an integer $n$ and a
function $0 \le p = p(n) \le 1$, the random graph $G(n,p)$ is a graph on $n$
labeled vertices $1, \ldots, n$, where each pair of vertices $(i,j)$ is chosen
to be an edge of $G$ independently and with probability $p$. We say that a graph
property $P$ holds almost surely, or a.s. for brevity, in $G(n,p)$ if the
probability that a graph $G$, drawn according to the distribution $G(n,p)$, has
$P$ tends to 1 as the number of vertices $n$ tends to infinity.
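The definition of $G(n,p)$ translates directly into a sampling procedure. The following is a minimal sketch; the function name `sample_gnp` and the `seed` parameter are illustrative choices, not part of the text.

```python
import random

def sample_gnp(n, p, seed=None):
    """Draw a graph from G(n, p) on labeled vertices 1..n.

    Each unordered pair {i, j} becomes an edge independently with probability p.
    The graph is returned as a set of pairs (i, j) with i < j.
    """
    rng = random.Random(seed)
    return {
        (i, j)
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
        if rng.random() < p
    }
```

Passing a fixed `seed` makes a draw reproducible, which is convenient when comparing algorithms on the same random inputs.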

As with many other graph parameters, it turns out that the average case is 
much simpler to handle than the worst case for both independence and chromatic 
numbers. Bollobás and Łuczak [?] showed that a.s. the chromatic number
of $G(n,p)$ satisfies $\chi(G) = (1+o(1))\,n\log_2(1/(1-p))/(2\log_2 n)$ for a constant $p$,
and $\chi(G) = (1+o(1))\,np/(2\ln(np))$ for $C/n \le p(n) \le o(1)$. It follows easily
from these results that a.s. $\alpha(G(n,p)) = (1-o(1))\,2\log_2 n/\log_2(1/(1-p))$ for
a constant $p$, and $\alpha(G(n,p)) = (1-o(1))\,2\ln(np)/p$ for $C/n \le p \le o(1)$. Also,
the greedy algorithm, coloring vertices of $G$ one by one and picking each time
the first available color for a current vertex, is known to produce a.s. in $G(n,p)$
with $p \ge$ […] a coloring whose number of colors is larger than the optimal one
by only a constant factor (see Ch. 11 of the monograph of Bollobás [?]). Hence
the largest color class produced by the greedy algorithm is a.s. smaller than the
independence number only by a constant factor.

Note however that the above positive statement about the performance of 
the greedy algorithm hides in fact one quite significant point. While being very 
successful in approximating the independence/chromatic number for most of 
the graphs in $G(n,p)$, the greedy algorithm may fail miserably for some "hard"
graphs on $n$ vertices. It is quite easy to fool the greedy algorithm by constructing
an example of a graph on $n$ vertices, for which the ratio of the number of colors
used by the greedy algorithm and the chromatic number will be close to $n$.
Moreover, it has been shown by Kučera [?] that for any fixed $\varepsilon > 0$ there exists
a graph $G$ on $n$ vertices for which, even after a random permutation of vertices,
the greedy algorithm produces a.s. a coloring in at least $n/\log_2 n$ colors, while
the chromatic number of $G$ is at most $n^{\varepsilon}$. Thus, we cannot say that the greedy
algorithm is always successful.
In contrast, here our goal is to develop approximation algorithms which will
work relatively well for all graphs on $n$ vertices and whose expected running time
will be polynomial in $n$. Given an algorithm $A$, whose domain is the set of all
graphs on $n$ vertices, and a probability space $G(n,p)$, the expected running time
of $A$ over $G(n,p)$ is defined as $\sum_G \Pr[G]\,R_A(G)$, where the sum runs over all
labeled graphs on $n$ vertices, $\Pr[G]$ is the probability of $G$ in $G(n,p)$, and $R_A(G)$
stands for the running time of $A$ on $G$. Thus, while looking for an algorithm
$A$ whose expected running time is polynomial, we can allow $A$ to spend a
superpolynomial time on some graphs on $n$ vertices, but it should be effective on
average.
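For very small $n$ the sum $\sum_G \Pr[G]\,R_A(G)$ can be evaluated literally by enumerating all labeled graphs. The sketch below does this in Python; `expected_running_time` and the cost model `toy_cost` are hypothetical names introduced only for illustration, and the cost model (expensive on a single rare graph, cheap elsewhere) is an assumption, not the paper's algorithm.

```python
from itertools import combinations, product

def expected_running_time(n, p, running_time):
    """Sum Pr[G] * R_A(G) over all labeled graphs G on vertices 1..n."""
    pairs = list(combinations(range(1, n + 1), 2))
    total = 0.0
    for mask in product((0, 1), repeat=len(pairs)):
        edges = {pair for pair, bit in zip(pairs, mask) if bit}
        m = len(edges)
        # Probability of this particular labeled graph in G(n, p).
        prob = p ** m * (1 - p) ** (len(pairs) - m)
        total += prob * running_time(n, edges)
    return total

def toy_cost(n, edges):
    # Hypothetical cost model: exponential on the empty graph only,
    # quadratic on every other graph.
    return 2 ** n if not edges else n ** 2
```

The enumeration itself takes time exponential in $n^2$, so it is only a way to check the definition on toy instances, not a tool for analysis.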

The approach of devising deterministic algorithms with expected polynomial 
time over some probability spaces has been undertaken by quite a few papers. 
Some of them, e.g., [?], discuss coloring algorithms with expected polynomial
time. We wish to stress that the underlying probability spaces in all these
papers are different from $G(n,p)$.

In this paper we present approximation algorithms for the independence
number and the chromatic number, whose expected running time is polynomial over
the probability spaces $G(n,p)$, when the edge probability $p(n)$ is not too low.
We have the following results.

Theorem 1. For any constant $\varepsilon > 0$ the following holds. If the edge probability
$p(n)$ satisfies $n^{-1/2+\varepsilon} \le p(n) \le 3/4$, then there exists a deterministic algorithm
approximating the independence number $\alpha(G)$ within a factor $O((np)^{1/2}/\log n)$
and having polynomial expected running time over $G(n,p)$.

Theorem 2. For any constant $\varepsilon > 0$ the following holds. If the edge probability
$p(n)$ satisfies $n^{-1/2+\varepsilon} \le p(n) \le 3/4$, then there exists a deterministic algorithm
approximating the chromatic number $\chi(G)$ within a factor $O((np)^{1/2}/\log n)$ and
having polynomial expected running time over $G(n,p)$.

Thus, in the most basic case $p = 1/2$ we get approximation algorithms
with approximation ratio $O(n^{1/2}/\log n)$, a considerable improvement over the best
known algorithms for the worst case [?], [?], whose approximation ratio is only
$O(n/\mathrm{polylog}(n))$. Note also that the smaller the edge probability $p(n)$, the better
the approximation ratio is in both our results.

Before turning to descriptions of our algorithms, we would like to say a 
few words about combinatorial ideas forming the basis of their analysis. As 
is typically the case with developing algorithms whose expected running time 
is polynomial, we will need to distinguish efficiently between "typical" graphs
in the probability space $G(n,p)$, for which it is relatively easy to provide a
good approximation algorithm, and "non-typical" ones, which are rare but may
be hard for approximating a desired quantity. As these rare graphs will have
an exponentially small probability in $G(n,p)$, this will allow us to spend an
exponential time on each of them. This in turn will enable us to approximate the
independence/chromatic number within the desired factor even for these graphs.

A separation between typical and non-typical instances will be made based 
on the first eigenvalue of an auxiliary matrix, to be defined later. Thus we may 






M. Krivelevich and V.H. Vu 



say that our algorithms exploit spectral properties of random graphs. Spectral 
techniques have proven very successful in many combinatorial algorithms. The 
ability to compute eigenvalues and eigenvectors of a matrix in polynomial time 
combined with understanding of the information provided by these parameters 
can constitute a very powerful tool, capable of solving algorithmic problems 
where all other methods failed. This is especially true for randomly generated
graphs; several successful examples of spectral techniques are [?], [?]. A survey
[?] discusses several applications of spectral techniques to graph algorithms.

In order to show that bad graphs have an exponentially small probability 
in G(n,p), we will prove a new large deviation result for eigenvalues of random 
symmetric matrices. This result, bounding the tails of the distribution of the first 
eigenvalue of a random symmetric matrix, is proven by applying the inequality 
of Talagrand [22] and may be of independent interest.

The rest of the paper is organized as follows. In Section 2 we provide technical 
tools to be used in the proof of correctness of our algorithms. In Section 3 we
present an algorithm for approximating the independence number. In Section 4 
an algorithm for approximating the chromatic number is described. Section 5 is 
devoted to concluding remarks. 

We wish to note that our results are asymptotic in nature. Therefore all 
usual asymptotic assumptions apply. In particular, we assume n to be large 
enough whenever needed. We omit routinely all ceiling and floor signs. No serious 
attempt is made to optimize constants involved. 

2 Preliminaries 

In this section we prove technical results needed for the analysis of our
approximation algorithms, to be proposed in the next two sections. In the first
subsection we analyze the performance of the greedy algorithm on random graphs;
the second subsection is devoted to bounding the tails of the first eigenvalue of
a random matrix.



2.1 Greedy Algorithm on Random Graphs 

Given a graph G = (V, E) and some fixed ordering of its vertices, the greedy 
coloring algorithm proceeds by scanning vertices of G in the given order and 
assigning the first available color for a current vertex. The greedy algorithm 
has long been known to be a quite successful algorithm for almost all graphs 
in $G(n,p)$, if the edge probability $p$ is not too small. For our purposes, we need
to prove that it is also extremely robust, i.e., uses a number of colors that is optimal
up to a constant factor with probability extremely close to 1. We will also prove
that the largest color class of the output of the greedy algorithm is of the order of the
independence number with even higher probability. Throughout this section we
assume that $p(n)$ falls in the range of Theorems 1 and 2, i.e., satisfies
$n^{-1/2+\varepsilon} \le p(n) \le 3/4$ for some positive constant $\varepsilon$. We fix the natural order of the $n$
vertices, i.e., $1, 2, \ldots, n$.
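The greedy coloring procedure described above can be sketched as follows. This is a minimal Python illustration assuming the graph is given as a set of vertex pairs on vertices $1, \ldots, n$; the function name `greedy_coloring` is an assumption made for the example.

```python
def greedy_coloring(n, edges):
    """Color vertices 1..n in natural order with the first available color.

    Returns the list of color classes, in the order the colors were opened.
    """
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    classes = []
    for v in range(1, n + 1):
        for cls in classes:
            # v joins the first class that contains none of its neighbors.
            if adj[v].isdisjoint(cls):
                cls.append(v)
                break
        else:
            # No existing color is available: open a new one.
            classes.append([v])
    return classes
```

The largest color class is then `max(classes, key=len)`, and the number of colors used is `len(classes)`; these are the two quantities bounded by Lemmas 1 and 2.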
Lemma 1. The probability in $G(n,p)$ that the largest color class produced by
the greedy algorithm has size less than $\ln n/(2p)$ is less than $2^{-n}$.

Proof. Set $\alpha_0 = \ln n/(2p)$, $t =$ […]. We call a family $\mathcal{C} = \{C_1, \ldots, C_t\}$ of $t$ subsets
of $V$ bad if

1. All $C_i$ are pairwise disjoint;

2. For every vertex $v \in V \setminus \bigcup_{i=1}^{t} C_i$ and for every $1 \le i \le t$, there is an edge
of $G$ connecting $v$ and $C_i$;

3. For every $1 \le i \le t$, $|C_i| < \alpha_0$.

It is easy to see that if $C_1, \ldots, C_t$ are the first $t$ colors produced by the greedy
algorithm, then the family $\mathcal{C} = \{C_1, \ldots, C_t\}$ satisfies requirements 1, 2 above.
Thus, if the greedy algorithm fails to produce a color class of size at least $\alpha_0$
while running on a graph $G$, the first $t$ colors of its output form a bad family in
$G$.

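Fix a collection $\mathcal{C} = \{C_1, \ldots, C_t\}$ with all $C_i$ being pairwise disjoint and of
size $|C_i| < \alpha_0$. A vertex $v \in V \setminus \bigcup_{i=1}^{t} C_i$ is stuck with respect to $\mathcal{C}$ if it satisfies
condition 2 above. The probability that $v$ is stuck is

$$\prod_{i=1}^{t} \left(1 - (1-p)^{|C_i|}\right) \le \exp\Big\{-\sum_{i=1}^{t} (1-p)^{|C_i|}\Big\} .$$

As the events that different vertices outside $\mathcal{C}$ are stuck are mutually
independent, we get

$$\Pr[\mathcal{C} \text{ bad}] \le \exp\Big\{-t(1-p)^{\alpha_0}\,\Big|V \setminus \bigcup_{i=1}^{t} C_i\Big|\Big\} \le […]$$
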

Therefore the probability that $G(n,p)$ contains a bad collection is at most

$$[…] < 2^{-n} .$$

□



Lemma 2. The probability in $G(n,p)$ that the greedy algorithm uses at least
$4np/\ln n$ colors is at most $2^{-2np/\ln n}$.

Proof. The proof presented here is essentially identical to the proof of Theorem
11.14 of the monograph of Bollobás [?], where a somewhat stronger result is
proven for the case of a constant $p$.

Set $k_0 = 2np/\ln n$, $k_1 = 2k_0 = 4np/\ln n$. Denote by $A^k$ the event that at least $k$ colors
are used by the greedy algorithm in coloring $G(n,p)$. Let also, for $k \le j \le n$, $B_j^k$
denote the event that vertex $j$ gets color $k$. As obviously $A^{k+1} = \bigcup_{j=k+1}^{n} B_j^{k+1}$,
we get for all $k_0 \le k < k_1$

$$\Pr[A^{k+1} \mid A^k] \le \sum_{j=k+1}^{n} \Pr[B_j^{k+1} \mid A^k] .$$

j=k+l 
Let us estimate now $\Pr[B_j^{k+1} \mid A^k]$ for $j \ge k+1$. Suppose $C_1, \ldots, C_k$ are the
color classes produced by the greedy algorithm before coloring vertex $j$. Then

$$[…]$$

Hence $\Pr[A^{k+1} \mid A^k] \le […] = (1+o(1))\,n e^{[…]} \le […]$.

We derive

$$\Pr[A^{k_1}] \le \prod_{k=k_0}^{k_1-1} \Pr[A^{k+1} \mid A^k] \le […] \le 2^{-k_0} .$$

□



2.2 Large Deviation Result 

In this subsection we present a new large deviation result, which will be needed
in the analysis of the algorithms. This result is also of independent interest. The
proof in this subsection will make use of the following powerful result, due to
Talagrand [22].

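Let $t_1, \ldots, t_m$ be independent random variables and let $S$ be the product
space (with the product measure) generated by $t_1, \ldots, t_m$. Fix a set $B \subset S$. For
a non-negative number $t$, define $B_t$ as follows

$$B_t = \Big\{x \in S \;\Big|\; \forall \alpha = (\alpha_1, \ldots, \alpha_m),\ \exists y \in B \text{ s.t. } \sum_{x_i \ne y_i} |\alpha_i| \le t\Big(\sum_{i=1}^{m} \alpha_i^2\Big)^{1/2}\Big\} .$$

Then Talagrand's inequality gives:

$$\Pr[\overline{B_t}]\,\Pr[B] \le e^{-t^2/4} .$$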



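Given a graph $G$ on $n$ vertices and a number $0 < p < 1$, we define a matrix
$M = M(G,p) = (m_{ij})_{i,j=1}^{n}$ as follows:

$$m_{ij} = \begin{cases} 1, & \text{if } i, j \text{ are non-adjacent in } G, \\ -q/p, & \text{otherwise,} \end{cases} \qquad (1)$$

where $q = 1 - p$. Let $\lambda_1(M) \ge \lambda_2(M) \ge \ldots \ge \lambda_n(M)$ denote the eigenvalues of
$M$.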



Lemma 3.

$$\Pr[\lambda_1(M) \ge 4(n/p)^{1/2}] \le 2^{-np/8} .$$



Proof. First notice that $M(G,p)$ is a random symmetric matrix which can be
generated as follows. Consider $\binom{n}{2}$ random variables $m_{ij}$, $1 \le i < j \le n$, where
$m_{ij} = 1$ with probability $q$, and $-q/p$ with probability $p$. Set $m_{ji} = m_{ij}$ and
$m_{ii} = 1$. This observation enables us to apply known results on eigenvalues of
random matrices. Füredi and Komlós proved implicitly in [?] that

$$E[\lambda_1(M)] = 2\left(\frac{qn}{p}\right)^{1/2} (1 + o(1)) .$$

Thus we need to bound the probability of deviation of the first eigenvalue from
its mean. Denote by $m$ the median of $\lambda_1(M)$ and set $B$ to be the set of all
matrices $M$ with $\lambda_1(M) \le m$. Clearly, $\Pr[B] = 1/2$. Assume that for a positive
$t$, a matrix $M^0$ satisfies $\lambda_1(M^0) \ge m + t$. Then by the Courant-Fischer theorem,
there is a vector $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ with norm 1 such that

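$$m + t \le x^{T} M^0 x = \sum_{1 \le i < j \le n} 2x_i x_j m^0_{ij} + \sum_{i=1}^{n} x_i^2 m^0_{ii} = 1 + \sum_{1 \le i < j \le n} 2x_i x_j m^0_{ij} .$$

On the other hand, for any matrix $M^1 \in B$ we have

$$m \ge x^{T} M^1 x = \sum_{1 \le i < j \le n} 2x_i x_j m^1_{ij} + \sum_{i=1}^{n} x_i^2 m^1_{ii} = 1 + \sum_{1 \le i < j \le n} 2x_i x_j m^1_{ij} .$$

It follows that

$$\sum_{1 \le i < j \le n} 2x_i x_j (m^0_{ij} - m^1_{ij}) \ge t .$$

Set $a_{ij} = 2x_i x_j$ for $1 \le i < j \le n$. Since $x$ has norm 1, we get
$\sum_{1 \le i < j \le n} a_{ij}^2 \le 2\big(\sum_{i=1}^{n} x_i^2\big)^2 = 2$. Moreover, since $|m^0_{ij} - m^1_{ij}| \le 1 + q/p = 1/p$,

$$\sum_{m^0_{ij} \ne m^1_{ij}} |a_{ij}| \ge tp \ge \frac{tp}{\sqrt{2}} \Big(\sum_{1 \le i < j \le n} a_{ij}^2\Big)^{1/2} .$$

This implies that $M^0 \notin B_{tp/\sqrt{2}}$. By Talagrand's inequality,

$$\Pr[\lambda_1(M) \ge m + t] \le \Pr[\overline{B_{tp/\sqrt{2}}}] \le \frac{e^{-t^2p^2/8}}{\Pr[B]} .$$

Given that $\Pr[B] = 1/2$, it follows that

$$\Pr[\lambda_1(M) \ge m + t] \le 2e^{-t^2p^2/8} .$$

Now set $B = \{M \mid \lambda_1(M) \le m - t\}$. By a similar argument, we can show that if
$\lambda_1(M^0) \ge m$, then $M^0 \notin B_{tp/\sqrt{2}}$. This, again by Talagrand's inequality, yields

$$\Pr[\lambda_1(M) \le m - t] \le 2e^{-t^2p^2/8} .$$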


Together, we have

$$\Pr[|\lambda_1(M) - m| \ge t] \le 4e^{-t^2p^2/8} ,$$

or equivalently

$$\Pr[|\lambda_1(M) - m| \ge \sqrt{8t}/p] \le 4e^{-t} .$$

The problem here is that the median $m$ is not exactly the mean. However,
from what we have already proved, it is simple to show that they differ only by
$O(1/p)$. Indeed, let $X = \lambda_1(M)$, $Y = pX$, let also $m_1 = pm$ be the median of
$Y$. Then we have $\Pr[|m_1 - Y| \ge t] \le 4e^{-t^2/8}$. Therefore,

$$|m_1 - E[Y]| \le E[|m_1 - Y|] \le \int_0^{\infty} t\,\Pr[|Y - m_1| \ge t]\,dt \le \int_0^{\infty} 4te^{-t^2/8}\,dt = 16 ,$$

implying $|E[X] - m| \le 16/p$. Recalling our assumption about $p(n)$, the claim of
the lemma follows directly. □

What is the connection between $\lambda_1(M(G))$ and $\alpha(G)$? The answer is given
by the following simple lemma.

Lemma 4. Let $M = M(G,p)$ be as defined in (1). Then $\lambda_1(M) \ge \alpha(G)$.

Proof. Let $k = \alpha(G)$. Then $M$ contains a $k$ by $k$ block of all 1's, indexed
by the vertices of an independent set of size $k$. It follows from interlacing that
$\lambda_1(M) \ge \lambda_1(1_{k \times k}) = k$. □
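Lemma 4 can also be seen through the Rayleigh quotient: for the unit vector $x$ that is constant on an independent set $S$ and zero elsewhere, $x^{T} M x = |S|$, and $\lambda_1(M) \ge x^{T} M x$ for every unit vector. The Python sketch below builds $M(G,p)$ and evaluates this quotient; it illustrates that alternative argument rather than the interlacing proof, and the helper names `build_matrix` and `rayleigh_quotient` are assumptions.

```python
def build_matrix(n, edges, p):
    """Matrix M(G, p): 1 for non-adjacent pairs (and the diagonal), -q/p for edges."""
    q = 1.0 - p
    edge_set = {frozenset(e) for e in edges}
    return [[-q / p if frozenset((i, j)) in edge_set else 1.0
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def rayleigh_quotient(M, support):
    """x^T M x for the unit vector x that equals 1/sqrt(k) on `support` (1-based)."""
    k = len(support)
    total = sum(M[i - 1][j - 1] for i in support for j in support)
    return total / k
```

On the 5-cycle with $p = 1/2$, the independent set $\{1, 3\}$ gives quotient 2, while the adjacent pair $\{1, 2\}$ gives 0, since the two $-q/p = -1$ entries cancel the diagonal.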

The reader has possibly noticed that $\lambda_1(M(G))$ is an upper bound not only
for the independence number of $G$, but also for its Lovász Theta-function [?].
Therefore our Lemma 3 provides in fact a large deviation result also for the
Theta-function. We get the following bound:

Lemma 5. Let $p = p(n)$ satisfy $p(n) = \omega(1)/n$. Then in $G(n,p)$

$$\Pr[\alpha(G) \ge 4(n/p)^{1/2}] \le \Pr[\theta(G) \ge 4(n/p)^{1/2}] \le 2^{-np/8} .$$

3 Approximating the Independence Number 

We are now in a position to present both of our approximation algorithms. In this
section we describe an algorithm for approximating the independence number,
while an algorithm for the chromatic number is described in the next section.
We assume that the algorithm is given a graph $G$ on $n$ vertices and the value of
the edge probability $p(n)$. For a subset $W \subseteq V$ we denote
$N(W) = \{v \in V \setminus W : \forall w \in W,\ (v,w) \notin E(G)\}$.

Step 1. Run the greedy algorithm on $G$. Let $I$ be a largest color class produced
by the greedy algorithm. If $|I| < \ln n/(2p)$, goto Step 5;

Step 2. Define matrix $M = M(G,p)$ as given by (1). Compute $\lambda_1(M)$. If
$\lambda_1(M) \le 4(n/p)^{1/2}$, output $I$;

Step 3. For each $W \subset V$ of size $|W| = \ln n/p$, compute $|N(W)|$. If for no $W$,
$|N(W)| > (n/p)^{1/2}$, output $I$;

Step 4. Check all subsets of $V$ of size $(2n/p)^{1/2}$. If none of them is independent,
output $I$;

Step 5. Check all subsets of $V$ and set $I$ to be an independent set of the largest
size. Output $I$.
Let us first check that the output always approximates $\alpha(G)$ within a factor of
$O((np)^{1/2}/\log n)$. If an independent set is output at Step 2, its size satisfies
$|I| \ge \ln n/(2p)$, and due to Lemma 4 we know that $\alpha(G) \le \lambda_1(M) \le 4(n/p)^{1/2}$. Hence
the approximation ratio for this case is $O((n/p)^{1/2}/(\ln n/p)) = O((np)^{1/2}/\ln n)$.
If $I$ is output at Step 3, then $G$ does not contain an independent set of size
$(n/p)^{1/2} + \ln n/p \le 2(n/p)^{1/2}$, since no $W$ has many non-neighbors. If $I$ is output
at Step 4, we get that $\alpha(G) < (2n/p)^{1/2}$, thus giving the desired approximation
ratio. Finally, if the output is produced at Step 5, it is the result of the exhaustive
search over all subsets of $V$, and thus its size is equal to $\alpha(G)$.
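As a worked check of the Step 2 case, the two bounds quoted above can be combined explicitly. The constant 8 below is only what these two bounds give together; no attempt is made to optimize it.

```latex
\[
\frac{\alpha(G)}{|I|}
  \;\le\; \frac{4\,(n/p)^{1/2}}{\ln n/(2p)}
  \;=\; \frac{8\,p\,(n/p)^{1/2}}{\ln n}
  \;=\; \frac{8\,(np)^{1/2}}{\ln n} .
\]
```

For $p = 1/2$ this reads $\alpha(G)/|I| \le 4\sqrt{2}\,n^{1/2}/\ln n$, matching the $O(n^{1/2}/\log n)$ ratio stated in the introduction.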

Now it remains to prove that the expected running time of the algorithm
is polynomial. We assume that the exhaustive search has running time $2^n$ (in
fact a better exponential bound is known, but we do not need it). Notice that
eigenvalues of an $n$ by $n$ matrix are computable in time polynomial in $n$. Clearly,
the cost of performing Steps 1 and 2 of the above algorithm is polynomial. The
only chance to get to Step 3 is to have a graph $G$ with $\lambda_1(M(G,p)) > 4(n/p)^{1/2}$.
The probability of this event is at most $2^{-np/8}$ by Lemma 3. The complexity of
Step 3 is […]. Therefore the expected amount of calculations performed
at Step 3 is of order at most

$$2^{-np/8}\,[…] \le […] \left(\frac{enp}{\ln n}\right)^{\ln n/p} 2^{-np/8} = o(1) ,$$

due to our assumption on $p(n)$. Similarly, we get to Step 4 only if there exists a
set $W$ of size $|W| = \ln n/p$ with $|N(W)| > (n/p)^{1/2}$. The probability of this event
in $G(n,p)$ is at most

$$[…] \binom{n}{(n/p)^{1/2}} (1-p)^{(\ln n/p)(n/p)^{1/2}} = o\left(\binom{n}{(n/p)^{1/2}}^{-1}\right) .$$

As executing Step 4 requires $\binom{n}{(n/p)^{1/2}}$ operations, the expected number of
operations performed at Step 4 is $o(1)$. Finally, we get to Step 5 if either the greedy
algorithm outputs no color class of size at least $\ln n/(2p)$ (and the probability
of this event is at most $2^{-n}$ by Lemma 1) or if $G$ contains an independent set
of size $(2n/p)^{1/2}$; this happens with probability at most

$$\binom{n}{(2n/p)^{1/2}} (1-p)^{\binom{(2n/p)^{1/2}}{2}} = o(2^{-n}) .$$

Thus the expected complexity of Step 5 is also $o(1)$. We can conclude that the
expected running time of the above algorithm over $G(n,p)$ is dominated by the
cost of performing its first two steps (in fact, Step 2) and is therefore polynomial
in $n$. This proves Theorem 1.



4 Approximating the Chromatic Number 

In this section we present an approximation algorithm for the chromatic number. 
We assume again that the algorithm is given a graph G on n vertices and the 
value of p as an input. 
Step 1. Run the greedy algorithm on $G$. Let $C_1$ be the resulting coloring. If the
number of colors in $C_1$ is at least $4np/\ln n$, goto Step 7;

Step 2. Define $M = M(G,p)$ according to (1) and compute $\lambda_1(M)$. If
$\lambda_1(M) \le 4(n/p)^{1/2}$, output $C_1$;

Step 3. If the number of vertices of $G$ of degree at least $4np$ exceeds $np$, goto
Step 7. Otherwise, color $G$ by first coloring each vertex of degree at least $4np$
by a separate color, and then running the greedy algorithm on the rest of the
graph and using fresh colors. Let $C_2$ denote the obtained coloring.

Step 4. For each $W \subset V$ of size $|W| = \ln n/p$, compute $|N(W)|$. If for no $W$,
$|N(W)| > n^{1/2}/(p^{1/2}\ln n)$, output $C_2$;

Step 5. Check all subsets of $V$ of size […]. If none of them is
independent, output $C_2$;

Step 6. Check whether there exist $\ln^{[…]} n$ pairwise disjoint independent sets of
size $n^{1/2}/(p^{1/2}\ln n)$ each. If there is no such collection, output $C_2$;

Step 7. Find an optimal coloring by the exhaustive search and output it.

As the reader has possibly noticed, the above algorithm is quite similar to
that of Section 3. The algorithm of this section is however somewhat more
complicated. This distinction is caused by the fact that the bound of Lemma 1 is
much stronger than that of Lemma 2.

Let us verify that the above algorithm approximates $\chi(G)$ within a factor of
$O((np)^{1/2}/\ln n)$. If coloring $C_1$ is output at Step 2, we get by Lemma 4
$\chi(G) \ge n/\alpha(G) \ge n/\lambda_1(M) \ge (np)^{1/2}/4$. On the other hand, $C_1$ has at most $4np/\ln n$
colors. Thus in this case the approximation ratio is $O((np)^{1/2}/\ln n)$. Observe
that if $G$ has at most $np$ vertices of degree at least $4np$, then the coloring $C_2$,
produced at Step 3, uses at most $np + 4np = 5np$ colors. If $C_2$ is output at Step
4, then $\alpha(G) = O(n^{1/2}/(p^{1/2}\ln n))$ and hence the approximation ratio in this
case is $O(np/(n^{1/2}p^{1/2}\ln n)) = O((np)^{1/2}/\ln n)$ as well. An identical argument
works for Step 5. If $C_2$ is output at Step 6, we claim that $\chi(G) = \Omega((np)^{1/2}\ln n)$.
Indeed, let $\mathcal{V} = (C_1, \ldots, C_k)$ be an optimal coloring of $G$. Let $k_0$ be the number
of color classes of size at least $n^{1/2}/(p^{1/2}\ln n)$, then $k_0 < \ln^{[…]} n$. Also, all color
classes are of size less than $n^{1/2}\ln^{[…]} n/p^{1/2}$. Thus the $k_0$ large color classes cover
altogether at most $n^{1/2}\ln^{[…]} n/p^{1/2} \ll n$ vertices. The rest of the vertices are
covered by $k - k_0$ color classes, each of size less than $n^{1/2}/(p^{1/2}\ln n)$, implying
$k \ge k - k_0 \ge (1-o(1))\,n/(n^{1/2}/(p^{1/2}\ln n)) = (1-o(1))(np)^{1/2}\ln n$. Recalling
that $C_2$ has $O(np)$ colors, we get the desired approximation ratio. Finally, if we
ever get to Step 7, the output is found by the exhaustive search and is thus
optimal.

The expected running time of the above algorithm can be shown to be
polynomial in $n$ similarly to the algorithm for the independence number. The only
notable difference is that here we use Lemma 2. We omit detailed calculations.

5 Concluding Remarks 

In this paper we presented approximation algorithms for the independence
number and the chromatic number of a graph. These algorithms were designed
to be efficient over the probability space $G(n,p)$ of random graphs for various
values of $p = p(n)$. For every $p(n)$ in the range $n^{-1/2+\varepsilon} \le p(n) \le 0.75$, both
our algorithms always achieve approximation ratio $O((np)^{1/2}/\log n)$ and their
average running time is polynomial over $G(n,p)$.

How good are our results? As stated in the introduction, their approxima- 
tion ratio is much better than that of the best known algorithms for the worst 
case. Still, the greedy algorithm, one of the simplest possible algorithms for find- 
ing an independent set or a coloring, performs much better for a typical graph 
than what is guaranteed by our approximation algorithms. There is some
indication, however, that the approximation ratio $O((np)^{1/2}/\log n)$ may be hard
to improve. Consider, for example, the basic case $p = 1/2$. Saks suggested
the following interesting problem. Suppose $G$ is a graph on $n$ vertices which
has been generated either according to the distribution $G(n, 1/2)$ or according
to the following distribution: choose first a random graph $G(n, 1/2)$ and then
pick randomly a subset $Q$ of size $k$ and force it to be independent by erasing all
edges inside $Q$. We denote the last model of random graphs by $G(n, 1/2, k)$. The
problem is to distinguish in polynomial time between the above two models. For
the case $k = \Omega(n^{1/2})$, Alon, Krivelevich and Sudakov [?] showed how to recover
the independent set of size $k$ in $G(n, 1/2, k)$, using spectral techniques, thus
clearly providing a tool for distinguishing between $G(n, 1/2)$ and $G(n, 1/2, k)$.
(See also [?] for a related result.) However, Saks' question is still open for every
$k = o(n^{1/2})$. Returning to our problem of developing efficient approximation
algorithms, note that if we are unable to distinguish between $G(n, 1/2)$ and
$G(n, 1/2, k)$ in polynomial time when $k = o(n^{1/2})$, our algorithm should act
the same for both models. As the independence number is a.s. of order $\ln n$ in
the first model and is a.s. $k$ in the second one, we cannot hope then to get an
algorithm with approximation ratio better than $k/\ln n$. This argument shows
that the question of Saks and the problem of developing good approximation 
algorithms for the independence/chromatic number may be tightly connected. 

An obvious open question is what can be done for smaller values of $p$, i.e., for
$p \ll n^{-1/2}$. While our large deviation result (Lemma 3) keeps working for smaller
values of $p$ as well, Step 3 of our algorithm for approximating the independence
number does not have polynomial expected time anymore (as […]
tends exponentially fast to infinity for $p \ll n^{-1/2}$). Using again the greedy
algorithm as our main tool, we can give an algorithm whose approximation ratio
is $O(np)$. We conjecture however that much better approximation algorithms
exist for sparse random graphs.

Spectral techniques combined with large deviation inequalities have played 
an essential role in both of our algorithms. It appears that this machinery can be 
used successfully to develop approximation algorithms with expected polynomial 
time for other hard combinatorial problems as well. One possible candidate is the 
problem of approximating the value of a maximum cut in a graph (MAXCUT). 
We hope that the ideas of this paper can be used to devise algorithms, finding 
almost optimal cuts in expected polynomial time for various values of the edge 
probability p. 
Abstract. Safely adding computational effects to a multi-stage language
has been an open problem. In previous work, a closed type constructor
was used to provide a safe mechanism for executing dynamically
generated code. This paper proposes a general notion of closed type as a
simple approach to safely introducing computational effects into
multi-stage languages. We demonstrate this approach formally in a core
language called Mini-ML$^{BN}_{ref}$. This core language safely combines multi-stage
constructs and ML-style references. In addition to incorporating state,
Mini-ML$^{BN}_{ref}$ also embodies a number of technical improvements over
previously proposed core languages for multi-stage programming.



1 Introduction 



Many important software applications require the manipulation of open code
at run-time. Examples of such applications include high-level program
generation, compilation, and partial evaluation [?]. But having a notion of values
that includes open code (that is, possibly containing free variables) complicates
both the (untyped) operational semantics and type systems for programming
languages designed to support such applications. This paper advocates a simple
and direct approach for safely adding computational effects into languages that
manipulate open code. The approach capitalises on a single type constructor
that guarantees that a given term will evaluate to a closed value at run-time.
We demonstrate our approach in the case of ML-style references [MTHM97].

We extend recent studies into the semantics and type systems for multi-level
and multi-stage languages. Multi-level languages [?]
provide a mechanism for constructing and combining open code. Multi-stage
languages [TS97, TBS98, MTBS99, BMTS99, Tah99, Tah00] extend multi-level
languages with a construct for executing the code generated at run-time. Multi-stage
programming can be illustrated using MetaML, an extension of
SML [MTHM97] with a type constructor ⟨_⟩ for open code. MetaML provides
-| datatype nat = z | s of nat;              (* natural numbers *)
val datatype nat
-| fun p z x y = (y := 1.0)                  (* conventional program *)
     | p (s n) x y = (p n x y; y := x * !y);
val p = fn : nat -> real -> real ref -> unit
-| fun p_a z x y = <~y := 1.0>               (* annotated program *)
     | p_a (s n) x y = <~(p_a n x y); ~y := ~x * !~y>;
val p_a = fn : nat -> <real> -> <real ref> -> <unit>
-| val p_cg =                                (* code generator *)
     fn n => <fn x y => ~(p_a n <x> <y>)>;
val p_cg = fn : nat -> <real -> real ref -> unit>
-| val p_sc = p_cg 3;                        (* specialised code *)
val p_sc = <fn x y => (y:=1.0; y:=x*!y; y:=x*!y; y:=x*!y)>
         : <real -> real ref -> unit>
-| val p_sp = run p_sc;                      (* specialised program *)
val p_sp = fn : real -> real ref -> unit



Fig. 1. Example of multi-stage programming with references in MetaML 



three basic staging constructs that operate on this type: Brackets ⟨_⟩, Escape
~_ and Run run _. Brackets defers the computation of its argument; Escape
splices its argument into the body of surrounding Brackets; and Run executes
its argument.

Figure 1 lists a sequence of declarations illustrating the multi-stage
programming method [TS97, BMTS99] in an imperative setting:

— p is a conventional "single-stage" program, which takes a natural n, a real
x, a reference y, and stores $x^n$ in y.

— p_a is a “two-stage” annotated version of p, which requires the natural n 
(as before), but uses only symbolic representations for the real x and the 
reference y. p_a builds a representation of the desired computation. When 
the first argument is zero, no assignment is performed, instead a piece of 
code for performing an assignment at a later time is generated. When the 
first argument is greater than zero, code is generated for performing an as- 
signment at a later time, and moreover the recursive call to p_a is performed 
so that the whole code-generation is performed in full. 

— p_cg is the code generator. Given a natural number, the code generator 
proceeds by building a piece of code that contains a lambda abstraction, and 
then using Escape performs an unfolding of the annotated program p_a over 
the “dummy variables” <x> and <y>. This powerful capability of “evaluation 
under lambda” is an essential feature of multi-stage programming languages. 
— p_sc is the specialised code generated by applying p_cg to a particular
natural number (in this case 3). The generated (high-level) code corresponds
closely to machine code, and should compile into a light-weight subroutine.

— p_sp is the specialised program, the ultimate goal of run-time code
generation. The function p_sp is a specialised version of p applied to 3, which does
not have unnecessary run-time overheads.
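The generate-then-run pattern of p_cg, p_sc and p_sp can be imitated in a conventional language by building source text and compiling it. The Python sketch below is only an analogy under stated assumptions: it manipulates untyped strings rather than MetaML's typed code values, a one-element list stands in for the reference y, and the names `power_codegen` and `run` are hypothetical.

```python
def power_codegen(n):
    """Return Python source for a function specialised to the exponent n."""
    # The recursion over n is unrolled at generation time: one assignment
    # for the base case, then one multiplication per unit of n.
    body = ["    y[0] = 1.0"] + ["    y[0] = x * y[0]"] * n
    return "def p_sp(x, y):\n" + "\n".join(body) + "\n"

def run(source):
    """Compile generated source and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["p_sp"]
```

For example, `run(power_codegen(3))` yields a function that performs exactly three multiplications with no loop or recursion left at run time. Unlike the typed Brackets and Escape constructs, nothing in this string-based version prevents the generated text from mentioning variables that are not in scope, which is the kind of problem the paper's type system is designed to rule out.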



Problem Safely adding computational effects to multi-stage languages has been
an open problem^1. For example, when adding ML-style references to a
multi-stage language like MetaML, one can have that "dynamically bound" variables
go out of the scope of their binder [?]. Consider the following MetaML^2
session:

-| val a = ref <1>;
val a = ... : ref <int>

-| val b = <fn x => ~(a:=<x>; <2>)>;
val b = <fn x => 2> : <int -> int>

-| val c = !a;
val c = <x> : <int>

In evaluating the second declaration, the variable x goes outside the scope of 
the binding lambda, and the result of the third line is wrong, since x is not 
bound in the environment, even though the session is well-typed according to
naive extensions of previously proposed type systems for MetaML. This form of
scope extrusion is specific to multi-level and multi-stage languages, and it does
not arise in traditional programming languages, where evaluation is generally
restricted to closed terms (e.g. see [?] and many subsequent studies). The
problem lies in the run-time interaction between free variables and references.



Remark 1. In the type system we propose (see Figure […]) the above session is
not well-typed. First, ref <1> cannot be typed, because <1> is not of a closed
type. Second, if we add some closedness annotation to make the first line
well-typed, i.e. val a = ref [<1>], then the type of a becomes ref [<int>], and
we can no longer type a:=<x> in the third line. Now, there is no way to add
closedness annotations, e.g. a:=[<x>], to make the third line well-typed; in fact
the (close)-rule is not applicable to derive a: ref nat°; x\ nat ih[(x)]:[(nat)]°. 



Contributions and organisation of this paper This paper shows that
multi-stage and imperative features can be combined safely in the same programming

^1 The current release of MetaML [?] is a substantial language, supporting most
features of SML and a host of novel meta-programming constructs. In this release,
safety is not guaranteed for meta-programs that use Run or effects. We hope to
incorporate the ideas presented in this paper into the next MetaML release.

^2 The observation made here also applies to $\lambda^{\bigcirc}$ [Dav96].
language. We demonstrate this formally using a core language, that we call
Mini-ML$^{BN}_{ref}$, which extends Mini-ML [CDDK86] with ML-style references and:

— A code type constructor ⟨_⟩

— A closed type constructor [_] [BMTS99], but with improved syntax borrowed
from $\lambda^{\Box}$ [DP96].

— A term construct run _ [?] typed with [_].



The key technical result is type safety for Mini-ML$^{BN}_{ref}$, i.e. evaluation of well- 
typed programs does not raise an error (see Theorem 1). The type system of 
Mini-ML$^{BN}_{ref}$ is simpler than some related systems for binding-time analysis (BTA), 
and it is also more expressive than most proposals for such systems (Section 3). 
In principle the additional features of Mini-ML$^{BN}_{ref}$ should not prevent us from 
writing programs like those in normal imperative languages. This can be demon- 
strated by giving an embedding of Mini-ML$_{ref}$ into our language, omitted for 
brevity. We expect the simple approach of using closed types to work in relation 
to other computational effects, for example: only closed values can be packaged 
with exceptions, only closed values can be communicated between processes. 



Note on Previous Work The results presented here are a significant general- 
isation of a recently proposed solution to the problem of assigning a sound type 
to Run. The naive typing run : $\langle t\rangle \to t$ of Run is unsound (see [TBS98]), since it 
allows one to execute an arbitrary piece of code, including “dummy variables” such 
as <x>. The closed type constructor $[\_]$ proposed in [BMTS99] allows one to give 
a sound typing run : $[\langle t\rangle] \to t$ for Run, since one can guarantee that values of 
type [t] will be closed. In this paper, we generalise this property of the closed 
type constructor to a bigger set of types, that we call closed types, and we 
also exploit these types to avoid the scope extrusion problem in the setting of 
imperative multi-stage programming. 

2 Mini-ML$^{BN}_{ref}$ 

This section describes the syntax, type system and operational semantics of 
Mini-ML$^{BN}_{ref}$, and establishes safety of well-typed programs. The types $\tau$ and closed 
types $\sigma$ are defined as 

$$\tau \in T ::= \sigma \mid \tau_1 \to \tau_2 \mid \langle\tau\rangle \qquad\qquad \sigma \in C ::= \mathsf{nat} \mid [\tau] \mid \mathsf{ref}\ \sigma$$
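As a reading aid, the two-sorted grammar of types above can be sketched as a small Python data model. This is only an illustration under an encoding of our own: the class names (Nat, Arrow, Code, Closed, Ref) and the two predicates are hypothetical and are not part of the paper or of MetaML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nat:            # nat
    pass


@dataclass(frozen=True)
class Arrow:          # tau1 -> tau2
    dom: object
    cod: object


@dataclass(frozen=True)
class Code:           # <tau>, the code type constructor
    body: object


@dataclass(frozen=True)
class Closed:         # [tau], the closed type constructor
    body: object


@dataclass(frozen=True)
class Ref:            # ref sigma
    content: object


def is_closed_type(t):
    """sigma ::= nat | [tau] | ref sigma"""
    if isinstance(t, Nat):
        return True
    if isinstance(t, Closed):
        return is_type(t.body)
    if isinstance(t, Ref):
        return is_closed_type(t.content)
    return False


def is_type(t):
    """tau ::= sigma | tau1 -> tau2 | <tau>"""
    if isinstance(t, Arrow):
        return is_type(t.dom) and is_type(t.cod)
    if isinstance(t, Code):
        return is_type(t.body)
    return is_closed_type(t)
```

Under this encoding `Ref(Code(Nat()))` is rejected by `is_type`, which mirrors the observation in Remark 1 that a reference to open code is not given a type, while `Ref(Closed(Code(Nat())))` is accepted as a closed type.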

Intuitively, a term can only be assigned a closed type $\sigma$ when it will evalu- 
ate to a closed value (see Lemma 4). Values of type $[\tau]$ are always closed, 
but relying only on the closed type constructor makes programming verbose 
[MTBS99, BMTS99, Tah00]. The generalised notion of closed type greatly im- 
proves the usability of the language (see Section 2.3). The set of Mini-ML$^{BN}_{ref}$ 
terms is parametric in an infinite set of variables $x \in X$ and an infinite set of 
locations $l \in L$: 

$$\begin{array}{rcl}
e \in E & ::= & x \mid \lambda x.e \mid e_1\ e_2 \mid \mathsf{fix}\ x.e \mid \mathsf{z} \mid \mathsf{s}\ e \mid (\mathsf{case}\ e\ \mathsf{of}\ \mathsf{z} \to e_1 \mid \mathsf{s}\ x \to e_2) \mid \\
& & \langle e\rangle \mid {\sim}e \mid \mathsf{run}\ e \mid [e] \mid (\mathsf{let}\ [x] = e_1\ \mathsf{in}\ e_2) \mid \\
& & \mathsf{ref}\ e \mid\ !e \mid e_1 := e_2 \mid l \mid \mathsf{fault}
\end{array}$$

^ Mini-ML$^{BN}_{ref}$ can also incorporate MetaML’s cross-stage persistence [TS97]. This can 
be done by adding an up, similar to that of $\lambda^{BN}$ [MTBS99], and by introducing a 
demotion operation. This development is omitted for space reasons. 

The first line lists the Mini-ML terms: variables, abstraction, application, fix- 
point for recursive definitions, zero, successor, and case-analysis on natural num- 
bers. The second line lists the three multi-stage constructs of MetaML: 
Brackets $\langle e\rangle$ and Escape ${\sim}e$ are for building and splicing code, and Run is for ex- 
ecuting code. The second line also lists the two “closedness annotations”: Close 
$[e]$ is for marking a term as being closed, and Let-Close is for forgetting these 
markings. The third line lists the three SML operations on references, constants 
$l$ for locations, and a constant fault for a program that crashes. The constants $l$ 
and fault are not allowed in user-defined programs, but they are instrumental to 
the operational semantics of Mini-ML$^{BN}_{ref}$. 

Remark 2. Realistic implementations should erase closedness annotations, by 
mapping $[e]$ to $e$ and $(\mathsf{let}\ [x] = e_1\ \mathsf{in}\ e_2)$ to $(\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2)$. 

The constant fault is used in the rules for symbolic evaluation of binders, 
e.g. we write 

$$\frac{\mu, e \stackrel{n+1}{\hookrightarrow} \mu', v}{\mu, \lambda x.e \stackrel{n+1}{\hookrightarrow} \mu'[x := \mathsf{fault}], \lambda x.v} \qquad\text{instead of}\qquad \frac{\mu, e \stackrel{n+1}{\hookrightarrow} \mu', v}{\mu, \lambda x.e \stackrel{n+1}{\hookrightarrow} \mu', \lambda x.v}.$$

This more hygienic handling of scope extrusion is compatible with the identifi- 
cation of terms modulo $\alpha$-conversion, and prevents new free variables from appearing 
as an effect of the evaluation (see Lemma 3). On the other hand, in implementa- 
tions there is no need to use the more hygienic rules, because during evaluation 
of a well-typed program (starting from the empty store) only closed values get 
stored. 

Note 1. We will use the following notation and terminology 

— Term equivalence, written $\equiv$, is $\alpha$-conversion. Substitution of $e$ for $x$ in $e'$ 
(modulo $\equiv$) is written $e'[x := e]$. 

— $m, n$ range over the set $N$ of natural numbers. Furthermore, $m \in N$ is iden- 
tified with the set $\{i \in N \mid i < m\}$ of its predecessors. 

— $f\colon A \stackrel{fin}{\to} B$ means that $f$ is a partial function from $A$ to $B$ with a finite 
domain, written $dom(f)$. 

— $\Sigma\colon L \stackrel{fin}{\to} T$ is a signature (for locations only), written $\{l_i\colon \mathsf{ref}\ \sigma_i \mid i \in m\}$. 

— $\Delta, \Gamma\colon X \stackrel{fin}{\to} (T \times N)$ are type-and-level assignments, written $\{x_i\colon \tau_i^{n_i} \mid i \in m\}$. 
We use the following operations on type-and-level assignments: 

$\{x_i\colon \tau_i^{n_i} \mid i \in m\}^{+n} = \{x_i\colon \tau_i^{n_i+n} \mid i \in m\}$ adds $n$ to the level of the $x_i$; 

$\{x_i\colon \tau_i^{n_i} \mid i \in m\}^{\le n} = \{x_i\colon \tau_i^{n_i} \mid n_i \le n \wedge i \in m\}$ removes the $x_i$ with level $> n$. 

— $\mu\colon L \stackrel{fin}{\to} E$ is a store. 

— $\Sigma, l\colon \mathsf{ref}\ \sigma$, $\Gamma, x\colon \tau^n$ and $\mu\{l = e\}$ denote extension of a signature, assignment 
and store respectively. 
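The two operations on type-and-level assignments can be illustrated with a short Python sketch. It assumes, purely for illustration, that an assignment is a dictionary from variable names to (type, level) pairs; the function names `shift` and `restrict` are hypothetical.

```python
def shift(ctx, n):
    """The +n operation: add n to the level of every variable."""
    return {x: (ty, level + n) for x, (ty, level) in ctx.items()}


def restrict(ctx, n):
    """The <=n operation: drop every variable whose level is greater than n."""
    return {x: (ty, level) for x, (ty, level) in ctx.items() if level <= n}


# A made-up assignment: f is declared at level 0 and x at level 1.
delta = {"f": ("nat -> nat", 0), "x": ("nat", 1)}
shifted = shift(delta, 1)
restricted = restrict(delta, 0)
```

Here `restricted` keeps only `f`, which is the kind of restricted context a premise such as that of (close) works with at level 0.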
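$$\frac{\Delta(x) = \tau^n}{\Sigma,\Delta;\Gamma \vdash x\colon \tau^n} \qquad \frac{\Gamma(x) = \tau^n}{\Sigma,\Delta;\Gamma \vdash x\colon \tau^n}$$

$$\frac{\Sigma,\Delta;\Gamma,x\colon \tau_1^n \vdash e\colon \tau_2^n}{\Sigma,\Delta;\Gamma \vdash \lambda x.e\colon (\tau_1 \to \tau_2)^n} \qquad \frac{\Sigma,\Delta;\Gamma \vdash e_1\colon (\tau_1 \to \tau_2)^n \quad \Sigma,\Delta;\Gamma \vdash e_2\colon \tau_1^n}{\Sigma,\Delta;\Gamma \vdash e_1\ e_2\colon \tau_2^n}$$

$$(\text{fix})\ \frac{\Sigma,\Delta;\Gamma,x\colon \tau^n \vdash e\colon \tau^n}{\Sigma,\Delta;\Gamma \vdash \mathsf{fix}\ x.e\colon \tau^n} \qquad \frac{}{\Sigma,\Delta;\Gamma \vdash \mathsf{z}\colon \mathsf{nat}^n} \qquad \frac{\Sigma,\Delta;\Gamma \vdash e\colon \mathsf{nat}^n}{\Sigma,\Delta;\Gamma \vdash \mathsf{s}\ e\colon \mathsf{nat}^n}$$

$$(\text{case*})\ \frac{\Sigma,\Delta;\Gamma \vdash e\colon \mathsf{nat}^n \quad \Sigma,\Delta;\Gamma \vdash e_1\colon \tau^n \quad \Sigma,\Delta,x\colon \mathsf{nat}^n;\Gamma \vdash e_2\colon \tau^n}{\Sigma,\Delta;\Gamma \vdash (\mathsf{case}\ e\ \mathsf{of}\ \mathsf{z} \to e_1 \mid \mathsf{s}\ x \to e_2)\colon \tau^n}$$

$$\frac{\Sigma,\Delta;\Gamma \vdash e\colon \tau^{n+1}}{\Sigma,\Delta;\Gamma \vdash \langle e\rangle\colon \langle\tau\rangle^n} \qquad \frac{\Sigma,\Delta;\Gamma \vdash e\colon \langle\tau\rangle^n}{\Sigma,\Delta;\Gamma \vdash {\sim}e\colon \tau^{n+1}} \qquad \frac{\Sigma,\Delta;\Gamma \vdash e\colon [\langle\tau\rangle]^n}{\Sigma,\Delta;\Gamma \vdash \mathsf{run}\ e\colon \tau^n}$$

$$(\text{close})\ \frac{\Sigma,\Delta^{\le n};\emptyset \vdash e\colon \tau^n}{\Sigma,\Delta;\Gamma \vdash [e]\colon [\tau]^n} \qquad \frac{\Sigma,\Delta;\Gamma \vdash e_1\colon [\tau_1]^n \quad \Sigma,\Delta,x\colon \tau_1^n;\Gamma \vdash e_2\colon \tau_2^n}{\Sigma,\Delta;\Gamma \vdash (\mathsf{let}\ [x] = e_1\ \mathsf{in}\ e_2)\colon \tau_2^n}$$

$$\frac{\Sigma,\Delta;\Gamma \vdash e\colon \sigma^n}{\Sigma,\Delta;\Gamma \vdash \mathsf{ref}\ e\colon \mathsf{ref}\ \sigma^n} \qquad \frac{\Sigma,\Delta;\Gamma \vdash e\colon \mathsf{ref}\ \sigma^n}{\Sigma,\Delta;\Gamma \vdash\ !e\colon \sigma^n} \qquad (\text{set})\ \frac{\Sigma,\Delta;\Gamma \vdash e_1\colon \mathsf{ref}\ \sigma^n \quad \Sigma,\Delta;\Gamma \vdash e_2\colon \sigma^n}{\Sigma,\Delta;\Gamma \vdash e_1 := e_2\colon \mathsf{ref}\ \sigma^n}$$

$$(\text{fix*})\ \frac{\Sigma,\Delta^{\le n},x\colon \tau^n;\emptyset \vdash e\colon \tau^n}{\Sigma,\Delta;\Gamma \vdash \mathsf{fix}\ x.e\colon \tau^n} \qquad \frac{\Sigma(l) = \mathsf{ref}\ \sigma}{\Sigma,\Delta;\Gamma \vdash l\colon \mathsf{ref}\ \sigma^n} \qquad (\text{close*})\ \frac{\Sigma,\Delta;\Gamma \vdash e\colon \sigma^n}{\Sigma,\Delta;\Gamma \vdash [e]\colon [\sigma]^n}$$

Fig. 2. Type System for Mini-ML$^{BN}_{ref}$ 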



2.1 Type System 

Figure 2 gives the rules for the type system of Mini-ML$^{BN}_{ref}$. A typing judgement 
has the form $\Sigma,\Delta;\Gamma \vdash e\colon \tau^n$, read “$e$ has type $\tau$ and level $n$ in $\Sigma,\Delta;\Gamma$”. $\Sigma$ 
gives the type of locations which can be used in $e$, $\Delta$ and $\Gamma$ (must have disjoint 
domains and) give the type and level of variables which may occur free in $e$. 

Remark 3. Splitting the context into two parts ($\Delta$ and $\Gamma$) is borrowed from 
$\lambda^{\Box}$ [DP96], and allows us to replace the cumbersome closedness annotation 
(close $e$ with $\{x_i = e_i \mid i \in m\}$) of [BMTS99] with the more convenient 
$[e]$ and $(\mathsf{let}\ [x] = e_1\ \mathsf{in}\ e_2)$. Informally, a variable $x\colon \tau^n$ declared in $\Gamma$ ranges over 
values of type $\tau$ at level $n$ (see Definition 1), while a variable $x\colon \tau^n$ declared in 
$\Delta$ ranges over closed values (i.e. without free variables) of type $\tau$ at level $n$. 

Most typing rules are similar to those for related languages [Dav96, BMTS99], 
but there are some notable exceptions: 

— (close) is the standard rule for $[e]$; the restricted context $\Sigma,\Delta^{\le n};\emptyset$ in the 
premise prevents $[e]$ from depending on variables declared in $\Gamma$ (as in $\lambda^{\Box}$ [DP96]) 
or on variables of level $> n$. The stronger rule (close*) applies only to closed 
types, and it is justified in Remark 5. 

— (fix) is the standard rule for $\mathsf{fix}\ x.e$, while (fix*) makes a stronger assumption 
on $x$, and thus can type recursive definitions (e.g. of closed functions) that 
are not typable with (fix). For instance, from $\emptyset; f'\colon [\tau_1 \to \tau_2]^n, x\colon \tau_1^n \vdash e\colon \tau_2^n$ 
we cannot derive $\mathsf{fix}\ f'.[\lambda x.e]\colon [\tau_1 \to \tau_2]^n$, while the following modified term 
$\mathsf{fix}\ f'.(\mathsf{let}\ [f] = f'\ \mathsf{in}\ [\lambda x.e[f' := [f]]])$ has the right type, but the wrong 
behaviour (it diverges!). On the other hand, the stronger rule (fix*) allows 
us to type $[\mathsf{fix}\ f.\lambda x.e[f' := [f]]]$, which has the desired operational behaviour. 

— There is a weaker variant of (case*), which we ignore, where the assumption 
$x\colon \mathsf{nat}^n$ is in $\Gamma$ instead of $\Delta$. 

— (set) does not assign type unit to $e_1 := e_2$, simply to avoid adding a unit type 
to Mini-ML$^{BN}_{ref}$. 

The type system enjoys the following basic properties: 

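Lemma 1 (Weakening). 

1. If $\Sigma,\Delta;\Gamma \vdash e\colon \tau_2^n$ and $x$ fresh, then $\Sigma,\Delta;\Gamma,x\colon \tau_1^m \vdash e\colon \tau_2^n$ 

2. If $\Sigma,\Delta;\Gamma \vdash e\colon \tau_2^n$ and $x$ fresh, then $\Sigma,\Delta,x\colon \tau_1^m;\Gamma \vdash e\colon \tau_2^n$ 

3. If $\Sigma,\Delta;\Gamma \vdash e\colon \tau_2^n$ and $l$ fresh, then $\Sigma,l\colon \mathsf{ref}\ \sigma_1,\Delta;\Gamma \vdash e\colon \tau_2^n$ 

Proof. Part 1 is proved by induction on the derivation of $\Sigma,\Delta;\Gamma \vdash e\colon \tau_2^n$. The 
other two parts are proved similarly. 

Lemma 2 (Substitution). 

1. If $\Sigma,\Delta;\Gamma \vdash e\colon \tau_1^m$ and $\Sigma,\Delta;\Gamma,x\colon \tau_1^m \vdash e'\colon \tau_2^n$, then $\Sigma,\Delta;\Gamma \vdash e'[x := e]\colon \tau_2^n$ 

2. If $\Sigma,\Delta^{\le m};\emptyset \vdash e\colon \tau_1^m$ and $\Sigma,\Delta,x\colon \tau_1^m;\Gamma \vdash e'\colon \tau_2^n$, then $\Sigma,\Delta;\Gamma \vdash e'[x := e]\colon \tau_2^n$ 

Proof. Part 1 is proved by induction on the derivation of $\Sigma,\Delta;\Gamma,x\colon \tau_1^m \vdash e'\colon \tau_2^n$. 
Part 2 is proved similarly. 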

2.2 CBV Operational Semantics 

Figure 3 gives the evaluation rules for the call-by-value (CBV) operational se- 
mantics of Mini-ML$^{BN}_{ref}$. Evaluation of a term $e$ at level $n$ can lead to 

— a result $v$ and a new store $\mu'$, when we can derive $\mu, e \stackrel{n}{\hookrightarrow} \mu', v$, 

— a run-time error, when we can derive $\mu, e \stackrel{n}{\hookrightarrow} \mathsf{err}$, or 

— divergence, when the search for a derivation goes into an infinite regress. 

We will show that the second case (error) does not occur for well-typed pro- 
grams (see Theorem 1). In general $v$ ranges over terms, but under appropriate 
assumptions on $\mu$, $v$ could be restricted to values at level $n$. 

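Definition 1. We define the set $V^n \subseteq E$ of values at level $n$ by the BNF 

$$\begin{array}{rcl}
v^0 \in V^0 & ::= & \lambda x.e \mid \mathsf{z} \mid \mathsf{s}\ v^0 \mid \langle v^1\rangle \mid [v^0] \mid l \\
v^{n+1} \in V^{n+1} & ::= & x \mid \lambda x.v^{n+1} \mid v_1^{n+1}\ v_2^{n+1} \mid \mathsf{fix}\ x.v^{n+1} \mid \\
& & \mathsf{z} \mid \mathsf{s}\ v^{n+1} \mid (\mathsf{case}\ v^{n+1}\ \mathsf{of}\ \mathsf{z} \to v_1^{n+1} \mid \mathsf{s}\ x \to v_2^{n+1}) \mid \\
& & \langle v^{n+2}\rangle \mid \mathsf{run}\ v^{n+1} \mid [v^{n+1}] \mid (\mathsf{let}\ [x] = v_1^{n+1}\ \mathsf{in}\ v_2^{n+1}) \mid \\
& & \mathsf{ref}\ v^{n+1} \mid\ !v^{n+1} \mid v_1^{n+1} := v_2^{n+1} \mid l \mid \mathsf{fault} \\
v^{n+2} \in V^{n+2} & ::= & \ldots \mid {\sim}v^{n+1}
\end{array}$$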






C. Calcagno, E. Moggi, and W. Taha 



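Normal Evaluation 

We give an exhaustive set of rules for evaluation of terms $e \in E$ at level 0 

$$\frac{}{\mu, x \stackrel{0}{\hookrightarrow} \mathsf{err}} \qquad \frac{}{\mu, \lambda x.e \stackrel{0}{\hookrightarrow} \mu, \lambda x.e} \qquad \frac{}{\mu, \mathsf{z} \stackrel{0}{\hookrightarrow} \mu, \mathsf{z}} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v}{\mu, \mathsf{s}\ e \stackrel{0}{\hookrightarrow} \mu', \mathsf{s}\ v}$$

$$\frac{\mu, e_1 \stackrel{0}{\hookrightarrow} \mu', \lambda x.e \quad \mu', e_2 \stackrel{0}{\hookrightarrow} \mu'', v \quad \mu'', e[x := v] \stackrel{0}{\hookrightarrow} \mu''', v'}{\mu, e_1\ e_2 \stackrel{0}{\hookrightarrow} \mu''', v'} \qquad \frac{\mu, e_1 \stackrel{0}{\hookrightarrow} \mu', v \not\equiv \lambda x.e}{\mu, e_1\ e_2 \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

$$\frac{\mu, e[x := \mathsf{fix}\ x.e] \stackrel{0}{\hookrightarrow} \mu', v}{\mu, \mathsf{fix}\ x.e \stackrel{0}{\hookrightarrow} \mu', v}$$

$$\frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', \mathsf{z} \quad \mu', e_1 \stackrel{0}{\hookrightarrow} \mu'', v}{\mu, (\mathsf{case}\ e\ \mathsf{of}\ \mathsf{z} \to e_1 \mid \mathsf{s}\ x \to e_2) \stackrel{0}{\hookrightarrow} \mu'', v} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v \not\equiv \mathsf{z} \mid \mathsf{s}\ e'}{\mu, (\mathsf{case}\ e\ \mathsf{of}\ \mathsf{z} \to e_1 \mid \mathsf{s}\ x \to e_2) \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

$$\frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', \mathsf{s}\ v \quad \mu', e_2[x := v] \stackrel{0}{\hookrightarrow} \mu'', v'}{\mu, (\mathsf{case}\ e\ \mathsf{of}\ \mathsf{z} \to e_1 \mid \mathsf{s}\ x \to e_2) \stackrel{0}{\hookrightarrow} \mu'', v'}$$

$$\frac{\mu, e \stackrel{1}{\hookrightarrow} \mu', v}{\mu, \langle e\rangle \stackrel{0}{\hookrightarrow} \mu', \langle v\rangle} \qquad \frac{}{\mu, {\sim}e \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

$$\frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', [\langle v\rangle] \quad \mu', v \stackrel{0}{\hookrightarrow} \mu'', v'}{\mu, \mathsf{run}\ e \stackrel{0}{\hookrightarrow} \mu'', v'} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v \not\equiv [\langle e'\rangle]}{\mu, \mathsf{run}\ e \stackrel{0}{\hookrightarrow} \mathsf{err}} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v}{\mu, [e] \stackrel{0}{\hookrightarrow} \mu', [v]}$$

$$\frac{\mu, e_1 \stackrel{0}{\hookrightarrow} \mu', [v] \quad \mu', e_2[x := v] \stackrel{0}{\hookrightarrow} \mu'', v'}{\mu, (\mathsf{let}\ [x] = e_1\ \mathsf{in}\ e_2) \stackrel{0}{\hookrightarrow} \mu'', v'} \qquad \frac{\mu, e_1 \stackrel{0}{\hookrightarrow} \mu', v \not\equiv [e]}{\mu, (\mathsf{let}\ [x] = e_1\ \mathsf{in}\ e_2) \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

$$\frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v \quad l \notin dom(\mu')}{\mu, \mathsf{ref}\ e \stackrel{0}{\hookrightarrow} \mu'\{l = v\}, l} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', l \quad \mu'(l) = v}{\mu, !e \stackrel{0}{\hookrightarrow} \mu', v} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v \not\equiv l \in dom(\mu')}{\mu, !e \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

$$\frac{\mu, e_1 \stackrel{0}{\hookrightarrow} \mu', l \quad \mu', e_2 \stackrel{0}{\hookrightarrow} \mu'', v}{\mu, e_1 := e_2 \stackrel{0}{\hookrightarrow} \mu''\{l = v\}, l} \qquad \frac{\mu, e_1 \stackrel{0}{\hookrightarrow} \mu', v \not\equiv l \in dom(\mu')}{\mu, e_1 := e_2 \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

$$\frac{}{\mu, l \stackrel{0}{\hookrightarrow} \mu, l} \qquad \frac{}{\mu, \mathsf{fault} \stackrel{0}{\hookrightarrow} \mathsf{err}}$$

Symbolic Evaluation 

$$\frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', \langle v\rangle}{\mu, {\sim}e \stackrel{1}{\hookrightarrow} \mu', v} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v \not\equiv \langle e'\rangle}{\mu, {\sim}e \stackrel{1}{\hookrightarrow} \mathsf{err}} \qquad \frac{\mu, e \stackrel{n+2}{\hookrightarrow} \mu', v}{\mu, \langle e\rangle \stackrel{n+1}{\hookrightarrow} \mu', \langle v\rangle} \qquad \frac{\mu, e \stackrel{n+1}{\hookrightarrow} \mu', v}{\mu, {\sim}e \stackrel{n+2}{\hookrightarrow} \mu', {\sim}v}$$

In all other cases symbolic evaluation is applied to the immediate sub-terms from left to 
right without changing level 

$$\frac{\mu, e_1 \stackrel{n+1}{\hookrightarrow} \mu', v_1 \quad \mu', e_2 \stackrel{n+1}{\hookrightarrow} \mu'', v_2}{\mu, e_1\ e_2 \stackrel{n+1}{\hookrightarrow} \mu'', v_1\ v_2}$$

and bound variables that have leaked in the store are replaced by fault 

$$\frac{\mu, e \stackrel{n+1}{\hookrightarrow} \mu', v}{\mu, \lambda x.e \stackrel{n+1}{\hookrightarrow} \mu'[x := \mathsf{fault}], \lambda x.v}$$

Error Propagation 

For space reasons, we omit the rules for error propagation. These rules follow the 
ML-convention for exceptions propagation. 

Fig. 3. Operational Semantics for Mini-ML$^{BN}_{ref}$ 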
Remark 4. Values at level 0 can be classified according to the five kinds of types: 

$$\begin{array}{l|ccccc}
\text{types} & \tau_1 \to \tau_2 & \mathsf{nat} & \langle\tau\rangle & [\tau] & \mathsf{ref}\ \sigma \\ \hline
\text{values} & \lambda x.e & \mathsf{z},\ \mathsf{s}\ v^0 & \langle v^1\rangle & [v^0] & l
\end{array}$$

Because of $\langle v^1\rangle$ the definition of value at level 0 involves values at higher levels. 
Values at level $> 0$, called symbolic values, are almost like terms. The differences 
between the BNF for $V^{n+1}$ and $E$ are in the productions for $\langle e\rangle$ and ${\sim}e$: 

— $\langle v^{n+1}\rangle$ is a value at level $n$, rather than level $n+1$; 

— ${\sim}v^{n+1}$ is a value at level $n+2$, rather than level $n+1$. 



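Note 2. We will use the following auxiliary notation to describe stores: 

— $\mu$ is a value store $\iff$ $\mu\colon L \stackrel{fin}{\to} V^0$; 

— $\Sigma \models \mu$ $\iff$ $\mu$ is a value store and $dom(\Sigma) = dom(\mu)$ and $\Sigma; \emptyset \vdash \mu(l)\colon \sigma^0$ 
whenever $l \in dom(\mu)$ and $\Sigma(l) = \mathsf{ref}\ \sigma$. 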

The following result establishes basic facts about the operational semantics, 
which are independent of the type system. 

Lemma 3 (Values). $\mu, e \stackrel{n}{\hookrightarrow} \mu', v$ implies $dom(\mu) \subseteq dom(\mu')$ and $FV(\mu', v) \subseteq 
FV(\mu, e)$; moreover, if $\mu$ is a value store, then $v \in V^n$ and $\mu'$ is a value store. 

Proof. By induction on the derivation of the evaluation judgement $\mu, e \stackrel{n}{\hookrightarrow} \mu', v$. 

The following property justifies why a $\sigma \in C$ is called a closed type. 

Lemma 4 (Closedness). $\Sigma,\Delta^{+1};\Gamma^{+1} \vdash v^0\colon \sigma^0$ implies $FV(v^0) = \emptyset$. 

Proof. By induction on the derivation of $\Sigma,\Delta^{+1};\Gamma^{+1} \vdash v^0\colon \sigma^0$. 

Remark 5. Let $V_\tau = \{v \in V^0 \mid \Sigma,\Delta^{+1};\Gamma^{+1} \vdash v\colon \tau^0\}$ be the set of values of 
type $\tau$ (in a given context $\Sigma,\Delta^{+1};\Gamma^{+1}$). It is easy to show that the mapping 
$[v] \mapsto v$ is an injection of $V_{[\tau]}$ into $V_\tau$, and moreover it is represented by the term 
$open = \lambda x.(\mathsf{let}\ [x] = x\ \mathsf{in}\ x)$, i.e. $open\colon [\tau] \to \tau$ and $open\ [v] \stackrel{0}{\hookrightarrow} v$. 
Note also that the Closedness Lemma implies the mapping $[v] \mapsto v$ is a bijection 
when $\tau$ is a closed type. A posteriori, this property justifies the typing rule 
(close*), which in turn ensures that the term $close = \lambda x.[x]$, representing the inverse 
mapping $v \mapsto [v]$, has type $\sigma \to [\sigma]$. 

Evaluation of Run at level 0 requires viewing a value at level 1 as a term to be 
evaluated at level 0. The following result says that this confusion in the levels is 
compatible with the type system. 

Lemma 5 (Demotion). $\Sigma,\Delta^{+1};\Gamma^{+1} \vdash v^{n+1}\colon \tau^{n+1}$ implies $\Sigma,\Delta;\Gamma \vdash v^{n+1}\colon \tau^n$. 
datatype nat 
val p = fn : nat -> real -> real ref -> unit 
val p_a = fn : nat -> <real> -> <real ref> -> <unit> 
val p_cg = fn : nat -> <real -> real ref -> unit> 
val p_sc = <fn x y => (y:=1.0; y:=x*!y; y:=x*!y; y:=x*!y)> 
         : <real -> real ref -> unit> 
> val p_sp = run[p_sc]; (* specialised program *) 
val p_sp = fn : real -> real ref -> unit 
> val p_pg = fn n => let [n] = [n] in run[p_cg n]; (* program generator *) 
val p_pg = fn : nat -> real -> real ref -> unit 

Fig. 4. The Example Written in Mini-ML$^{BN}_{ref}$ 



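Proof. By induction on the derivation of $\Sigma,\Delta^{+1};\Gamma^{+1} \vdash v^{n+1}\colon \tau^{n+1}$. 

To fully claim the reflective nature of Mini-ML$^{BN}_{ref}$ we need also a Promotion 
Lemma (which, however, is not relevant to the proof of Type Safety). 

Lemma 6. $\Sigma,\Delta;\Gamma \vdash e\colon \tau^n$ implies $e \in V^{n+1}$ and $\Sigma,\Delta^{+1};\Gamma^{+1} \vdash e\colon \tau^{n+1}$. 

Finally, we establish the key result relating the type system to the operational 
semantics. This result entails that evaluation of a well-typed program $\emptyset;\emptyset \vdash e\colon \tau^0$ 
cannot raise an error, i.e. $\emptyset, e \stackrel{0}{\hookrightarrow} \mathsf{err}$ is not derivable. 

Theorem 1 (Safety). $\mu, e \stackrel{n}{\hookrightarrow} d$ and $\Sigma \models \mu$ and $\Sigma,\Delta^{+1};\Gamma^{+1} \vdash e\colon \tau^n$ imply 
that there exist $\mu'$ and $v^n$ and $\Sigma'$ such that $d = (\mu', v^n)$ and $\Sigma,\Sigma' \models \mu'$ and 
$\Sigma,\Sigma',\Delta^{+1};\Gamma^{+1} \vdash v^n\colon \tau^n$. 

Proof. By induction on the derivation of the evaluation judgement $\mu, e \stackrel{n}{\hookrightarrow} d$. 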

2.3 The Power Function 

While ensuring the safety of Mini-ML$^{BN}_{ref}$ requires a relatively non-trivial type 
system, the power examples presented at the beginning of this paper can still 
be expressed just as concisely as in MetaML. First, we introduce the following 
top-level derived forms: 

— val x = e; p stands for (let [x] = [e] in p), with the following derived rules 
for typing and evaluation at level 0 

$$\frac{\Sigma,\Delta^{\le n};\emptyset \vdash e\colon \tau_1^n \qquad \Sigma,\Delta,x\colon \tau_1^n;\Gamma \vdash p\colon \tau_2^n}{\Sigma,\Delta;\Gamma \vdash (\mathsf{val}\ x = e;\ p)\colon \tau_2^n} \qquad \frac{\mu, e \stackrel{0}{\hookrightarrow} \mu', v \qquad \mu', p[x := v] \stackrel{0}{\hookrightarrow} \mu'', v'}{\mu, (\mathsf{val}\ x = e;\ p) \stackrel{0}{\hookrightarrow} \mu'', v'}$$

— a top-level definition by pattern-matching is reduced to one of the form 
val f = e; p in the usual way (that is, using the case and fix constructs). 

Note that this means identifiers declared at top-level go in the closed part $\Delta$ 
of a context $\Sigma,\Delta;\Gamma$. We assume to have a predefined closed type real with a 
function times: real $\to$ real $\to$ real and a constant 1.0: real. Figure 4 reconsiders 
the example of Figure 1 used in the introduction in Mini-ML$^{BN}_{ref}$: 

— the declarations of p, p_a, p_cg and p_sc do not require any change; 

— in the declaration of p_sp one closedness annotation has been added; 

— p_pg is a program generator with the same type as the conventional program 
p, but when applied to a natural, say 3, it returns a specialised program (i.e. p_sp). 
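The roles of p, p_cg, p_sp and p_pg can be mimicked in Python to make the staging idea concrete. This is a loose analogy rather than a rendering of Mini-ML$^{BN}_{ref}$: code values are represented as source strings, a reference cell as a one-element list, and the helper `run` is a hypothetical stand-in for Run built on Python's `exec`. Python checks none of the closedness conditions that the type system enforces.

```python
def p(n, x, y):
    """Conventional power program: leaves x**n in the one-element list y."""
    y[0] = 1.0
    for _ in range(n):
        y[0] = x * y[0]


def p_cg(n):
    """Code generator: returns the source text of a program specialised to n."""
    lines = ["def p_sp(x, y):", "    y[0] = 1.0"]
    lines += ["    y[0] = x * y[0]"] * n   # unrolled loop, one update per step
    return "\n".join(lines) + "\n"


def run(code):
    """Execute a closed piece of generated code and return the function it defines."""
    namespace = {}
    exec(code, namespace)
    return namespace["p_sp"]


def p_pg(n):
    """Program generator: same interface as p once n is supplied."""
    return run(p_cg(n))


cell = [0.0]
p_pg(3)(2.0, cell)   # the specialised program contains no loop over n
```

The generated text for n = 3 contains three straight-line updates, corresponding to the body of p_sc in Figure 4.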



3 Related Work 

The problem we identify at the beginning of this paper also applies to Davies’s 
$\lambda^{\bigcirc}$ [Dav96], which allows open code and symbolic evaluation under lambda (but 
has no construct for running code). Therefore, the naive addition of references 
leads to the same problem of scope extrusion pointed out in the Introduction. 

Mini-ML$^{BN}_{ref}$ is related to Binding-Time Analyses (BTAs) for imperative lan- 
guages. Intuitively, a BTA takes a single-stage program and produces a two-stage 
one (often in the form of a two-level program) [JGS93, Tah00]. Thiemann and 
Dussart describe an off-line partial evaluator for a higher-order language 
with first-class references, where a two-level language with regions is used to 
specify a BTA. Their two-level language allows storing dynamic values in static 
cells, but the type and effect system prohibits operating on static cells within 
the scope of a dynamic lambda (unless these cells belong to a region local to 
the body of the dynamic lambda). While both this BTA and our type system 
ensure that no run-time error (such as scope extrusion) can occur, they provide 
incomparable extensions. 

Hatcliff and Danvy propose a partial evaluator for a computational 
metalanguage, and they formalise existing techniques in a uniform framework by 
abstracting from dynamic computational effects. However, this partial evaluator 
does not seem to allow interesting computational effects at specialisation time. 
Abstract. We describe SAFL, a call-by-value first-order functional lan- 
guage which is syntactically restricted so that storage may be statically 
allocated to fixed locations. Evaluation of independent sub-expressions 
happens in parallel; we use locking techniques to protect shared-use 
function definitions (i.e. to prevent unrestricted parallel accesses to their 
storage locations for argument and return values). SAFL programs have 
a well defined notion of total (program and data) size which we refer 
to as ‘area’; similarly we can talk about execution ‘time’. Fold/unfold 
transformations on SAFL provide mappings between different points on 
the area-time spectrum. The space of functions expressible in SAFL is 
incomparable with the space of primitive recursive functions, in partic- 
ular interpreters are expressible. The motivation behind SAFL is hard- 
ware description and synthesis; we have built an optimising compiler 
for translating SAFL to silicon. 



1 Introduction 

This paper addresses the idea of a functional language, SAFL, which 

— can be statically allocated — all variables are allocated to fixed storage loca- 
tions at compile time — there is no stack or heap; and 

— has independent sub-expressions evaluated concurrently. 

While this concept might seem rather odd in terms of the capabilities of modern 
processor instruction sets, our view is that it neatly abstracts the primitives 
available to a hardware designer. Our desire for static allocation is motivated 
by the observation that dynamically-allocated storage does not map well onto 
silicon: an addressable global store leads to a von Neumann bottleneck which 
inhibits the natural parallelism of a circuit. SAFL has a call-by-value semantics 
since strict evaluation naturally facilitates parallel execution which is well suited 
to hardware implementation. 



U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 37-48, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 
To emphasise the hardware connection we define the area of a SAFL program 
to be the total space required for its execution. Due to static allocation we see 
that area is O(length of program); similarly we can talk about execution time. 
Fold/unfold transformations at the SAFL level correspond directly to area- 
time tradeoffs at the hardware level. 

In this paper we are concerned with the properties of the SAFL language 
itself rather than the details of its translation to hardware. In the light of this 
and for the sake of clarity, we present an implementation of SAFL by means of 
a translation to an abstract machine code which we claim mirrors the primitives 
available in hardware. The design of an optimising compiler which translates 
SAFL into hardware is presented in a companion paper HH. A more practical 
use of SAFL for hardware/software co-design is given in 0. 

The body of this paper is structured as follows. Section 2 describes the SAFL 
language and Section 3 describes an implementation on a parallel abstract ma- 
chine. We then argue that SAFL is well suited for hardware description and 
synthesis, and show how fold/unfold transformations can represent SAFL 
area-time tradeoffs. Finally, we discuss more theoretical issues: how SAFL 
relates to primitive recursive functions and problems concerning higher-order 
extensions. The last section concludes and outlines some future directions. 



Comparison with Other Work 

The motivation for static allocation is not new. Gomard and Sestoft 0 describe 
globalization which detects when stack or heap allocation of function parameters 
can be implemented more efficiently with global variables. However, whereas 
globalization is an optimisation which may in some circumstances improve per- 
formance, in our work static allocation is a fundamental property of SAFL en- 
forced by the syntactic restrictions described in Section 2. 

Previous work on compiling declarative specifications to hardware has cen- 
tred on how functional languages themselves can be used as tools to aid the 
design of circuits. Sheeran’s muFP m and Lava systems use functional pro- 
gramming techniques (such as higher order functions) to express concisely the 
repeating structures that often appear in hardware circuits. In this framework, 
using different interpretations of primitive functions corresponds to various op- 
erations including behavioural simulation and netlist generation. Our approach 
takes SAFL constructs (rather than gates) as primitive. Although this restricts 
the class of circuits we can describe to those which satisfy certain high-level prop- 
erties, it permits high-level analysis and optimisation yielding efficient hardware. 
We believe our association of function definitions with hardware resources (see 
Section 4) to be novel. 

Various authors have described silicon compilers (e.g. for C 0 and Oc- 
cam [TOjl. Although rather beyond the scope of this paper, we argue that the 
flexibility of functional languages provides much more scope for analysis and 
optimisation. 
Hofmann describes a type system which allows space pre-allocated for 
argument data-structures to be re-used by in-place update. Boundedness there 
means that no new heap space is allocated although stack space may be un- 
bounded. As such our notion of static allocatability is rather stronger. 

2 Formalism 

We use a first-order language of recursion equations (higher order features are 
briefly discussed in a later section). Let $c$ range over a set of constants, $x$ over vari- 
ables (occurring in let declarations or as formal parameters), $a$ over primitive 
functions (such as addition) and $f$ over user-defined functions. For typograph- 
ical convenience we abbreviate formal parameter lists $(x_1, \ldots, x_k)$ and actual 
parameter lists $(e_1, \ldots, e_k)$ to $\vec{x}$ and $\vec{e}$ respectively; the same abbreviations are 
used in let definitions. SAFL has syntax of: 

— terms $e$ given by: 

$$e ::= c \mid x \mid \mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 \mid \mathtt{let}\ \vec{x} = \vec{e}\ \mathtt{in}\ e_0 \mid a(e_1, \ldots, e_{\mathrm{arity}(a)}) \mid f(e_1, \ldots, e_{\mathrm{arity}(f)})$$

— programs $p$ given by: 

$$\begin{array}{rcl}
p & ::= & \mathtt{fun}\ f^{11}(\vec{x}) = e_{11}\ \mathtt{and} \ldots \mathtt{and}\ f^{1r_1}(\vec{x}) = e_{1r_1} \\
& & \ldots \\
& & \mathtt{fun}\ f^{n1}(\vec{x}) = e_{n1}\ \mathtt{and} \ldots \mathtt{and}\ f^{nr_n}(\vec{x}) = e_{nr_n}.
\end{array}$$

We refer to a phrase of the form 

$$\mathtt{fun}\ f^{i1}(\vec{x}) = e_{i1}\ \mathtt{and} \ldots \mathtt{and}\ f^{ir_i}(\vec{x}) = e_{ir_i}$$

as a (mutually recursive) function group. The notation $f^{ij}$ just means the $j$th 
function of group $i$. Programs have a distinguished function main (normally $f^{nr_n}$) 
which represents an external world interface: at the hardware level it accepts 
values on an input port and may later produce a value on an output port. 

To simplify semantic descriptions we will further assume that all function 
and variable names are distinct; this is particularly useful for static allocation 
since we can use the name of the variable for the storage location to which it is 
allocated. 

We impose additional stratification restrictions^ on the $e_{ij}$ occurring as bod- 
ies of the $f^{ij}$; arbitrary calls to previous definitions are allowed, but recursion 
(possibly mutual) is restricted to tail recursion to enforce static allocatability. 
This is formalised as a well-formedness check. Define the tailcall contexts, $TC$, by 

$$TC ::= [\ ] \mid \mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ TC \mid \mathtt{if}\ e_1\ \mathtt{then}\ TC\ \mathtt{else}\ e_3 \mid \mathtt{let}\ \vec{x} = \vec{e}\ \mathtt{in}\ TC$$

^ Compare this with stratified negation in the deductive database world. 

The well-formedness condition is then that, for every user-function application 
$f^{ij}(\vec{e})$ within function definition $f^{gk}(\vec{x}) = e_{gk}$ in group $g$, we have that: 

$$i < g \ \vee\ (i = g \ \wedge\ \exists (C \in TC).\ e_{gk} = C[f^{ij}(\vec{e})])$$

The first part ($i < g$) is merely static scoping (definitions are in scope if previ- 
ously declared) while the second part says that a call to a function in the same 
group ($i$) as its definition ($g$) is only valid if the call is actually in tailcall context. 
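The well-formedness condition lends itself to a direct recursive check. The following Python sketch is one possible reading of it, under assumptions of our own: terms are encoded as tagged tuples, a program is a list of groups (dictionaries from function names to bodies), and the names `calls_ok` and `well_formed` are hypothetical.

```python
def calls_ok(e, g, group_of, tail=True):
    """Check every user-function call in e, the body of a function of group g.

    `tail` records whether e sits in a tailcall context (the hole of some TC).
    """
    tag = e[0]
    if tag in ("const", "var"):
        return True
    if tag == "if":
        _, cond, then_e, else_e = e
        # Only the two branches of a conditional inherit tail position.
        return (calls_ok(cond, g, group_of, False)
                and calls_ok(then_e, g, group_of, tail)
                and calls_ok(else_e, g, group_of, tail))
    if tag == "let":
        _, bindings, body = e
        return (all(calls_ok(rhs, g, group_of, False) for _, rhs in bindings)
                and calls_ok(body, g, group_of, tail))
    if tag == "prim":
        return all(calls_ok(a, g, group_of, False) for a in e[2])
    if tag == "call":
        _, f, args = e
        i = group_of[f]
        allowed = i < g or (i == g and tail)
        return allowed and all(calls_ok(a, g, group_of, False) for a in args)
    raise ValueError("unknown term: %r" % (tag,))


def well_formed(program):
    """program: list of groups; each group maps function names to bodies."""
    group_of = {f: i for i, group in enumerate(program) for f in group}
    return all(calls_ok(body, g, group_of)
               for g, group in enumerate(program)
               for body in group.values())


def var(x): return ("var", x)
def const(c): return ("const", c)
def dec(x): return ("prim", "-", [var(x), const(1)])
def is_zero(x): return ("prim", "=", [var(x), const(0)])


# Mutual tail recursion inside one group, calls to earlier groups elsewhere.
good = [
    {"f": ("if", is_zero("x"), const(1), ("call", "g", [dec("x")])),
     "g": ("if", is_zero("x"), const(2), ("call", "f", [dec("x")]))},
    {"h": ("prim", "+", [var("x"), var("y")])},
    {"main": ("call", "h", [("call", "f", [const(8)]), ("call", "g", [const(9)])])},
]

# Non-tail recursion: the recursive call is an argument of a primitive.
bad = [{"fact": ("if", is_zero("x"), const(1),
                 ("prim", "*", [var("x"), ("call", "fact", [dec("x")])]))}]
```

The first program is accepted and the second is rejected, because the recursive call to `fact` is not in a tailcall context.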

3 Implementing SAFL 

We give a translation $[\![\cdot]\!]$ of SAFL programs into an abstract machine code 
which mirrors primitives available in hardware (its correctness relies on SAFL 
restrictions). Each function definition corresponds to one block of code. In order 
to have temporaries available we will assume that each expression and sub- 
expression is labelled by a unique label number (or ‘occurrence’) from which a 
storage location can be generated. Label names are assumed to be distinct from 
variables so we can use the notation $M_x$ and $M_\ell$ to mean the storage location 
associated with variable $x$ or label $\ell$ respectively. We use the notation $\ell : e$ to 
indicate expression $e$ has label $\ell$. The expression 

if x=1 then y else f(x,y-1) 

might then be more fully written as (temporarily using the notation $e^\ell$ instead 
of $\ell : e$ used elsewhere): 

(if = y£3 

We write $f$.formals to stand for $(M_{x_1}, \ldots, M_{x_k})$ where $\vec{x}$ is the tuple of formal 
parameters of $f$ (which are already assumed to be globally distinct from all other 
variable names). Similarly, we will assume all functions in group $i$ leave their 
result in the storage location $M_{f^{i*}.\mathrm{result}}$; this is necessary to ensure tailcalls 
to other functions in group $i$ behave as intended.^ (The notation $i*$ in general 
refers to a common resource shared by the members of function group $i$.) 

In addition to the storage locations as above, we need two other forms of 
storage: for each function group $i$ we have a location $L_{i*}$ to store the return 
link (accessed by JSR and RET); and a semaphore $S_{i*}$ to protect its (statically 
allocated) arguments, temporaries and the like from calls in competing PAR 
threads by enforcing mutual exclusion. 

The abstract instructions for the machine are as follows: 



^ A type system would require that the result types of all functions in a group are 
identical, because they return each others’ values, so $M_{f^{i*}.\mathrm{result}}$ has a well-defined 
size. 
$m := m'$
    Copy $m'$ to $m$. 
$(m_1, \ldots, m_k) := (m'_1, \ldots, m'_k)$
    Copy the $m'_i$ to the $m_i$ in any order. 
$m := PRIMOP_a(m_1, \ldots, m_k)$
    Perform the operation corresponding to built-in primitive $a$. 
$LOCK(S)$
    Lock semaphore $S$. 
$UNLOCK(S)$
    Release semaphore $S$. 
$JMP(ep)$
    Branch to function entrypoint $ep$; used for tailcall. 
$JSR(m, ep)$
    Store some representation of the point of call in location $m$ and branch 
    to function entrypoint $ep$. 
$RET(m)$
    Branch to the instruction following the point of call specified by $m$. 
$COND(m, seq_1, seq_2)$
    If location $m$ holds a ‘true’ value then execute opcode sequence $seq_1$, 
    otherwise $seq_2$. 
$PAR(seq_1, \ldots, seq_k)$
    Execute opcode sequences $seq_1, \ldots, seq_k$ in parallel, waiting for all to 
    complete before terminating. 

Instructions are executed sequentially, except that JSR, JMP and RET alter the 
execution sequence. The PAR construct represents fork-join parallelism (each of 
the operand sequences are executed) and COND the usual conditional (one of 
the two operand sequences is executed). 

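Assuming $e$ is a sub-expression of a function body of group $g$, the compilation 
function $[\![e]\!]^g m$ gives an opcode sequence which evaluates $e$ to storage location $m$ 
(we omit $g$ for readability in the following; it is only used to identify tailcalls): 

$$\begin{array}{l}
[\![c]\!]m \;=\; m := c \\[4pt]
[\![x]\!]m \;=\; m := M_x \\[4pt]
[\![\mathtt{if}\ (\ell : e_1)\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3]\!]m \;=\; [\![e_1]\!]M_\ell;\ COND(M_\ell,\ [\![e_2]\!]m,\ [\![e_3]\!]m) \\[4pt]
[\![\mathtt{let}\ (x_1, \ldots, x_k) = (e_1, \ldots, e_k)\ \mathtt{in}\ e_0]\!]m \;=\; PAR([\![e_1]\!]M_{x_1}, \ldots, [\![e_k]\!]M_{x_k});\ [\![e_0]\!]m \\[4pt]
[\![a(\ell_1 : e_1, \ldots, \ell_k : e_k)]\!]m \;=\; PAR([\![e_1]\!]M_{\ell_1}, \ldots, [\![e_k]\!]M_{\ell_k});\ m := PRIMOP_a(M_{\ell_1}, \ldots, M_{\ell_k}) \\[4pt]
[\![f^{ij}(\ell_1 : e_1, \ldots, \ell_k : e_k)]\!]m \;=\; 
\left.\begin{array}{l}
PAR([\![e_1]\!]M_{\ell_1}, \ldots, [\![e_k]\!]M_{\ell_k}); \\
LOCK(S_{i*}); \\
f^{ij}.\mathrm{formals} := (M_{\ell_1}, \ldots, M_{\ell_k}); \\
JSR(L_{i*}, EntryPt_{ij}); \\
m := M_{f^{i*}.\mathrm{result}}; \\
UNLOCK(S_{i*})
\end{array}\right\}\ \text{if } i < g \\[4pt]
[\![f^{ij}(\ell_1 : e_1, \ldots, \ell_k : e_k)]\!]m \;=\; 
\left.\begin{array}{l}
PAR([\![e_1]\!]M_{\ell_1}, \ldots, [\![e_k]\!]M_{\ell_k}); \\
f^{ij}.\mathrm{formals} := (M_{\ell_1}, \ldots, M_{\ell_k}); \\
JMP(EntryPt_{ij})
\end{array}\right\}\ \text{if } i = g
\end{array}$$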
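To make the shape of the compilation function concrete, here is a small Python sketch covering only constants, variables, conditionals and primitive applications. It is an illustration under our own assumptions, not the FLaSH compiler: terms are tagged tuples, instructions are plain tuples, label numbers are drawn from a counter instead of being attached to the source, and the conditional evaluates its test into a fresh location before the COND.

```python
import itertools


def compile_expr(e, m, fresh=None):
    """Translate expression e into a list of instructions leaving its value in m."""
    if fresh is None:
        fresh = itertools.count(1)            # source of hypothetical label numbers
    tag = e[0]
    if tag == "const":                        # [[c]]m = m := c
        return [("MOVE", m, ("const", e[1]))]
    if tag == "var":                          # [[x]]m = m := M_x
        return [("MOVE", m, "M_" + e[1])]
    if tag == "if":                           # test first, then COND on its location
        _, cond, then_e, else_e = e
        loc = "M_l%d" % next(fresh)
        code = compile_expr(cond, loc, fresh)
        code.append(("COND", loc,
                     compile_expr(then_e, m, fresh),
                     compile_expr(else_e, m, fresh)))
        return code
    if tag == "prim":                         # arguments under PAR, then PRIMOP
        _, op, args = e
        locs = ["M_l%d" % next(fresh) for _ in args]
        par = ("PAR", [compile_expr(a, loc, fresh) for a, loc in zip(args, locs)])
        return [par, ("PRIMOP", m, op, locs)]
    raise ValueError("unsupported construct: %r" % (tag,))


# if x=1 then y else 0, compiled into location "m"
example = ("if", ("prim", "=", [("var", "x"), ("const", 1)]),
           ("var", "y"), ("const", 0))
code = compile_expr(example, "m")
```

User-function calls are left out on purpose: they would additionally need the group index to choose between the LOCK/JSR/UNLOCK sequence and the JMP used for tailcalls.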
Finally we need to compile each function definition to a sequence of instruc- 
tions labelled with the function name: 



The above translation is naive (often semaphores and temporary storage loca- 
tions are used unnecessarily) but explains various aspects; an optimising compiler 
for hardware purposes has been written. 

Proposition 1. The translation $[\![e]\!]$ correctly implements SAFL programs in 
that executing the abstract machine code coincides with standard eager evaluation. 


Note that the translation would fail if we use one semaphore-per-function instead 
of one-per-group. Consider the program 

fun f(x) = if x=0 then 1 else g(x-1) 
and g(x) = if x=0 then 2 else f(x-1); 
fun h(x,y) = x+y; 
fun main() = h(f(8),g(9)); 

where there is then the risk that the PAR construct for the actual arguments 
to h will simultaneously take locks on the semaphores for f and g resulting in 
deadlock. 
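The deadlock risk can be seen as a lock-order inversion between the two PAR threads. The Python sketch below checks a simplified model for such inversions; the model itself is an assumption made for illustration (each thread is described only by the order in which it acquires semaphores, and under the hypothetical one-semaphore-per-function scheme the thread evaluating f(8) holds the semaphore of f when it needs that of g, and vice versa). The names `lock_pairs` and `may_deadlock` are hypothetical.

```python
from itertools import combinations


def lock_pairs(order):
    """All pairs (a, b) such that lock a is held when lock b is requested."""
    return {(a, b)
            for i, a in enumerate(order)
            for b in order[i + 1:]
            if a != b}


def may_deadlock(thread_orders):
    """True if two threads acquire some pair of locks in opposite orders."""
    pair_sets = [lock_pairs(order) for order in thread_orders]
    for p, q in combinations(pair_sets, 2):
        if any((b, a) in q for (a, b) in p):
            return True
    return False


# One semaphore per function: the two argument threads of h invert the order.
per_function = [["S_f", "S_g"], ["S_g", "S_f"]]

# One semaphore per group: both threads compete for the same single lock.
per_group = [["S_fg"], ["S_fg"]]
```

With one semaphore per group the two threads are simply serialised, which is why the translation locks $S_{i*}$ and not a per-function semaphore. The check only looks for two-thread inversions and is not a general deadlock detector.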

4 Hardware Synthesis Using SAFL 

As part of the FLaSH project (Functional Languages for Synthesising Hardware), 
we have implemented an optimising silicon compiler which translates SAFL 
specifications into structural Verilog. We have found that SAFL is able to express 
a wide range of hardware designs; our tools have been used to build a small 
commercial processor.^ 

The static allocation properties of SAFL allow our compiler to enforce a 
direct mapping between a function definition: 

$$\mathtt{fun}\ f(\vec{x}) = e$$

and a hardware block, $H_f$, with output port, $P_f$, consisting of: 

— a fixed amount of storage (registers holding values of the arguments $\vec{x}$) and 

— a circuit to compute $e$ to $P_f$. 

Hence, multiple calls to a function $f$ at the source level correspond directly 
to sharing the resource $H_f$ at the hardware level. As the FLaSH compiler syn- 
thesises multi-threaded hardware, we have to be careful to ensure that multiple 

^ We implemented the instruction set of the Cambridge Consultants XAP processor: 
http://www.camcon.co.uk; we did not support the SIF instruction. 




o/e. 
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accesses to a shared hardware resource will not occur simultaneously. We say that 
resource, Hf, is subject to a sharing conflict if multiple accesses may occur con- 
currently. Sharing conflicts are dealt with by inserting arbiters (cf. semaphores 
in our software translation) . Program analysis is used to detect potential sharing 
conflicts — arbiters are only synthesised where necessary. 

Our practical experience of using the FLaSH system to design and build real 
hardware has brought to light many interesting techniques that conventional 
hardware description languages cannot exploit. These are outlined below. 

4.1 Automatic Generation of Parallel Hardware 

Hammond [5] observes:

“It is almost embarrassingly easy to partition a program written in a 
strict [functional] language [into parallel threads]. Unfortunately, the 
partition that results often yields a large number of very fine-grained 
tasks.” 

He uses the word unfortunately because his discussion takes place in the
context of software, where fairly coarse-grained parallelism is required to
ensure the overhead of fork/join does not outweigh the benefits of parallel
evaluation.

In contrast, we consider the existence of “a large number of very fine-grained 
tasks” to be a very fortunate occurrence: in a silicon implementation, very fine- 
grained parallelism is provided with virtually no overhead! The FLaSH compiler 
produces hardware where all function arguments and let-declarations are eval- 
uated in parallel. 

4.2 Source-Level Program Transformation 

We have found that source-level program transformation of SAFL specifications 
is a powerful technique. A designer can explore a wide range of hardware imple- 
mentations by repeatedly transforming an initial specification. 

We have investigated a number of transformations which correspond to con- 
cepts in hardware design. Due to space constraints we can only list the transfor- 
mations here: 

Resource Sharing vs Duplication: Since a single user-defined function cor- 
responds to a single hardware block SAFL provides fine-grained control over 
resource sharing/duplication. 

Static vs Dynamic Scheduling: By default, function arguments are evalu- 
ated in parallel. Thus compiling f(4)+f(5) will generate an arbiter to se- 
quentialise access to the shared resource Hf. Alternatively we can use a 
let-declaration to specify an ordering statically. The circuit corresponding 
to let x=f (4) in x+f (5) does not require dynamic arbitration; we have 
specified a static order of access to Hf . 

Area-Time Tradeoffs: We observe that fold/unfold transformations correspond
directly to area-time tradeoffs at the hardware level. This can be seen
as a generalisation of resource sharing/duplication (see Section 6).

Hardware-Software Partitioning: We have demonstrated a source-source
transformation which allows us to represent hardware/software partitioning
within SAFL.

At the SAFL level it is relatively straightforward to apply these transformations. 
Investigating the same tradeoffs entirely within RTL Verilog would require time- 
consuming and error-prone modifications throughout the code. 

4.3 Static Analysis and Optimisation 

Declarative languages are more susceptible to analysis and transformation than
imperative languages. In order to generate efficient hardware, the FLaSH
compiler performs the following high-level analysis techniques (documented in
[11]):

Parallel Conflict Analysis is performed at the abstract syntax level,
returning a set of function calls which require arbitration at the hardware
level.

Register Placement is the process of inserting temporary storage registers
into a circuit. The FLaSH compiler translates specifications into intermediate
code (based on control/data flow graphs) and performs data-flow analysis
at this level in order to place registers. (This optimisation is analogous to
minimising the profligate use of M_i temporaries seen in the software
translation [[·]].)

Timing Analysis (with respect to a particular implementation strategy) is 
performed through an abstract interpretation where functions and operators 
return the times taken to compute their results. 

4.4 Implementation Independence 

The high level of specification that SAFL provides means that our hardware
descriptions are implementation independent. Although the current FLaSH
compiler synthesises hardware in a particular style, there is the potential to
develop a variety of back-ends, targeting a wide range of hardware
implementations.

In particular, we believe that SAFL would lend itself to asynchronous cir- 
cuit design as the compositional properties of functions map directly onto the 
compositional properties of asynchronous hardware modules. We plan to de- 
sign an asynchronous back-end for FLaSH in order to compare synchronous and 
asynchronous implementations. 

5 A Hardware Example 

In order to provide a concrete example of the benefits of designing hardware in 
SAFL consider the following specification of a shift-add multiplier: 

fun mult(x, y, acc) =
  if (x=0 | y=0) then acc
  else mult(x<<1, y>>1, if y.bit0 then acc+x else acc)

(Footnote to Section 4.4, on the synthesis style of the current FLaSH
compiler: the generated hardware is synchronous with bundled data and ready
signals.)

From this specification, the FLaSH compiler generates a hardware resource,
H_mult, with three data-inputs (x, y and acc), a data-output (for returning
the function result), a control-input to trigger computation and a
control-output which signals completion. The multiplier circuit contains some
control logic, two 1-place shifters, an adder and three registers which are
used to latch data-inputs. We trigger H_mult by placing argument values on the
data-inputs and signalling an event on the control-input. The result can be
read from the data-output when a completion event is signalled on the
control-output.

These 3 lines of SAFL produce over 150 lines of RTL Verilog. Synthesising a
16-bit version of mult, using Mentor Graphics' Leonardo tool, yields 1146
2-input equivalent gates. Implementing the same algorithm directly in RTL
Verilog took longer to write and yielded an almost identical gate count.
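
As a software cross-check of the shift-add specification above, the following is a minimal Python transliteration. It is a sketch, not the FLaSH output: the `width` parameter and the masking to model a 16-bit datapath are assumptions added for illustration.

```python
def mult(x, y, acc, width=16):
    """Shift-add multiplication mirroring the SAFL definition of mult.

    Values are truncated to `width` bits (an assumption standing in for the
    fixed-width registers of the synthesised circuit).
    """
    mask = (1 << width) - 1
    if x == 0 or y == 0:
        return acc
    # All three arguments are computed from the old x, y and acc, as in the
    # call-by-value SAFL tail call.
    new_acc = (acc + x) & mask if y & 1 else acc
    return mult((x << 1) & mask, y >> 1, new_acc, width)


product = mult(6, 7, 0)
```

The recursion depth is bounded by `width`, since y loses one bit per call.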

6 Fold/Unfold for Area-Time Tradeoff 

In Section 4.2 we observed that the fold/unfold transformation can be used
to trade area for time. As an example of this consider:

fun f x = ...

fun main(x,y) = g(f(x),f(y))

The two calls to f are serialised by mutual exclusion before g is called. Now
use fold/unfold to duplicate f as f', replacing the second call to f with one
to f'. This can be done using an unfold, a definition rule and a fold yielding

fun f x = ...
fun f' x = ...

fun main(x,y) = g(f(x),f'(y))

The second program has more area than the original (by the size of f) but runs
more quickly because the calls to f(x) and f'(y) execute in parallel.

Although the example given above is trivial, we find fold/unfold to be a
useful technique in choosing a hardware implementation of a given
specification. Note that fold/unfold allows us to do more than resource
sharing/duplication tradeoffs. For example, folding/unfolding recursive
function calls before compiling to synchronous hardware corresponds to trading
the amount of work done per clock cycle against clock speed: mult can be
mechanically transformed into:

fun mult(x, y, acc) =
  if (x=0 | y=0) then acc
  else let (x',y',acc') = (x<<1, y>>1,
                           if y.bit0 then acc+x else acc) in
       if (x'=0 | y'=0) then acc'
       else mult(x'<<1, y'>>1, if y'.bit0 then acc'+x' else acc')

which takes half as many clock cycles. 
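
The halving can be checked in software with a short sketch. The assumption here (not stated in the source) is that one invocation of mult corresponds to one clock cycle, so the loop counters below stand in for cycle counts; the function names are hypothetical.

```python
def step(x, y, acc):
    """One shift-add step: the argument tuple of the original tail call."""
    return x << 1, y >> 1, (acc + x if y & 1 else acc)


def mult_cycles(x, y, acc=0):
    """Original mult: one step per invocation."""
    cycles = 0
    while x != 0 and y != 0:
        x, y, acc = step(x, y, acc)
        cycles += 1
    return acc, cycles


def mult_unrolled_cycles(x, y, acc=0):
    """Unfolded mult: up to two steps per invocation."""
    cycles = 0
    while x != 0 and y != 0:
        x, y, acc = step(x, y, acc)
        if x != 0 and y != 0:          # the inner test on x', y'
            x, y, acc = step(x, y, acc)
        cycles += 1
    return acc, cycles


original = mult_cycles(3, 255)
unrolled = mult_unrolled_cycles(3, 255)
```

For an 8-bit multiplier operand such as 255, the original form takes 8 iterations and the unfolded form takes 4, with the same product; for an odd number of steps the unfolded count rounds up.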

(Footnote to the gate count in Section 5: this figure includes the gates
required for the three argument registers.)

7 Theoretical Expressibility

Here we consider the expressibility of programs in SAFL. Clearly in one sense, 
each such program represents a finite state machine as it has a bounded number 
of states and memory locations, and therefore is very inexpressive (but this 
argument catches the essence of the problem no more than observing that a 
personal computer is also a finite state automaton) . 

Consider the relation to Primitive Recursive (PR) functions. In the PR
definition scheme, suppose g and h are already shown to be primitive recursive
functions (of k and k+1 arguments), then the definition of f

f(0, x_1, ..., x_k) = g(x_1, ..., x_k)
f(n+1, x_1, ..., x_k) = h(f(n, x_1, ..., x_k), x_1, ..., x_k)

is also primitive recursive of k+1 arguments. We see that the SAFL
restrictions require h to be the identity projection on its first argument,
but that our definitional scheme allows recursion more general than descent
through a well-founded order (n+1 via n eventually to 0 in the integer form
above). In particular SAFL functions may be partial.

Thus we conclude that, in practice, our statically allocatable functions
represent a subset of general recursion that is incomparable with the subset
specified by primitive recursion. Note that we cannot use translation to
continuation-passing form where all calls are tailcalls because SAFL is first
order (but see the next section). Jones [7] shows the subtle variance of
expressive power on recursion forms, assignability and higher-order types.

As a slightly implausible aside, suppose we consider statically allocatable 
functional languages where values can range over any natural number. In this 
case the divergence from primitive recursion becomes even clearer — even if we 
have an assertion that the statically allocated functional program is total then we 
cannot in general transform it into primitive recursive form. To see this observe 
that we can code a register machine interpreter as such a statically allocated 
program with register machine program being Ackermann’s function. 

8 Higher Order Extensions to SAFL 

Clearly a simple addition of higher-order functions to SAFL would break static
allocatability by allowing recursion other than in the program structure.
Consider for example the traditional

let g(n,h) = if n=0 then 1 else n * h(n-1,h)
let f(n) = g(n,g)

trick to encode the factorial function in a form which requires O(n) space.
(Of course one could use an accumulator argument to implement this in O(1)
space, but we want the static allocatability rules to be intensional.)

(Footnote to the primitive recursion scheme of Section 7: in practice we
widen this definition to allow additional intensional forms without affecting
the space of functions definable.)

A simple extension is to allow functions to be passed as values, and
function-valued expressions to be invoked. (No closures can be constructed
because of the recursion equation syntax.) One implementation of this is as
follows: use a control-flow analysis such as 0-CFA to identify possible
targets of each indirect call (i.e. a call to a function-valued expression).
This can be done in cubic time. We then fault any indirect calls which are
incompatible with the function definition hierarchy in the original program,
i.e. those where a function in definition group i contains a possible indirect
call to a function in group g with g ≥ i (unless the indirect call is in
tailcall context and g = i). Information from 0-CFA also permits a
source-to-source transformation to first-order using a case branch over the
possible functions callable at that point: we map the call e'(e), where e' can
evaluate to g, h or i, into:

if e'=g then g(e) else if e'=h then h(e) else i(e) 
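
The case-branch transformation above can be mimicked in Python by passing a tag instead of a function value. This is only a sketch: the bodies of `g`, `h` and `i`, the string tags, and `call_indirect` are hypothetical, and the set of possible targets is assumed to have been supplied by a 0-CFA-style analysis.

```python
# Three first-order functions that may flow to the same indirect call site.
def g(x):
    return x + 1


def h(x):
    return x * 2


def i(x):
    return x - 3


def call_indirect(tag, arg):
    """First-order replacement for the indirect call e'(e).

    `tag` names the function that e' evaluates to; the analysis is assumed
    to guarantee it is one of "g", "h" or "i".
    """
    if tag == "g":
        return g(arg)
    elif tag == "h":
        return h(arg)
    else:
        return i(arg)


result = call_indirect("h", 5)
```

Every call in the transformed program is now a direct call to a named function, so the first-order static allocation check applies unchanged.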

The problem with adding nested function definitions (or λ-expressions) is
that it is problematic to statically allocate storage for the resulting
closures. Even programs which use only tail-recursion and might at first sight
appear harmless, such as

fun r(x) = (some function depending on x)

fun f(x,g) = if x=0 then g(0)
             else f(x-1, r(x) o g)

require unlimited store. Similarly, the translation to CPS (continuation
passing style) transforms any program into an equivalent one using only
tailcalls, but at the cost of increasing it to higher-order; again see [7] for
more details.

One restriction which allows function closures is the Algol solution:
functions can be passed as parameters but not returned as results (or at least
not beyond the scope of any of their free variables). It is well known that
such functions can be stack implemented, which in the SAFL world means their
storage is bounded. Of course we still need the 0-CFA check as detailed above.



9 Conclusions and Further Work 

This paper introduces the idea of statically allocated functional languages which 
are interesting in themselves as well as being apposite and powerful for expressing 
hardware designs. However there remains much to be done to explore their uses 
as hardware synthesis languages, e.g. optimising hardware compilation, type 
systems, synchronous versus asynchronous translations, etc. 

Currently programs support a 'start and wait for result' interface. We realise
that in real hardware systems we need to interact with other devices having
internal state. We are considering transactional models for such interfaces
including the use of channels. Forms of functional language input/output
explored in Gordon's thesis [3] may also be useful.

Abstract. We establish that the algorithmic complexity of the minimum
spanning tree problem is equal to its decision-tree complexity.
Specifically, we present a deterministic algorithm to find a minimum
spanning forest of a graph with n vertices and m edges that runs in time
O(T*(m,n)) where T* is the minimum number of edge-weight comparisons
needed to determine the solution. The algorithm is quite simple
and can be implemented on a pointer machine.

Although our time bound is optimal, the exact function describing it
is not known at present. The current best bounds known for T* are
T*(m,n) = Ω(m) and T*(m,n) = O(m · α(m,n)), where α is a certain
natural inverse of Ackermann's function.

Even under the assumption that T* is super-linear, we show that if the
input graph is selected from G_{n,m}, our algorithm runs in linear time
w.h.p., regardless of n, m, or the permutation of edge weights. The analysis
uses a new martingale for G_{n,m} similar to the edge-exposure martingale for
G_{n,p}.

Keywords: Graph algorithms; minimum spanning tree; optimal com- 
plexity. 



1 Introduction 

The minimum spanning tree (MST) problem has been studied for much of this
century and yet despite its apparent simplicity, the problem is still not
fully understood. Graham and Hell give an excellent survey of results from the
earliest known algorithm of Boruvka to the invention of Fibonacci heaps,
which were central to the algorithms in [FT87, GGST86]. Chazelle [Chaz97]
presented an MST algorithm based on the Soft Heap [Chaz98] having complexity
O(m α(m,n) log α(m,n)), where α is a certain inverse of Ackermann's function.
Recently Chazelle [Chaz99] modified the algorithm in [Chaz97] to bring down
the running time to O(m · α(m,n)). Later, and in independent work, a similar
algorithm of the same running time was presented in Pettie [Pet99], which
gives an alternate exposition of the O(m · α(m,n)) result. This is the
tightest time bound for the MST problem to date, though not known to be
optimal.

All algorithms mentioned above work on a pointer machine under
the restriction that edge weights may only be subjected to binary comparisons.
If a more powerful model is assumed, the MST can be computed optimally.
Fredman and Willard [FW90] showed that on a unit-cost RAM where the
bit-representation of edge weights may be manipulated, the MST can be computed
in linear time. Karger et al. [KKT95] presented a randomized MST algorithm
that runs in linear time with high probability, even if edge weights are only
subject to comparisons.

It is still unknown whether these more powerful models are necessary to
compute the MST in linear time. However, we give a deterministic,
comparison-based MST algorithm that runs on a pointer machine in O(T*(m,n))
time, where T*(m,n) is the number of edge-weight comparisons needed to
determine the MST on any graph with m edges and n vertices. Additionally, by
considering our algorithm's performance on random graphs, we show that it runs
in linear time w.h.p., regardless of edge-density or the permutation of edge
weights.

Although our algorithm is optimal, its precise running time is not known at
this time. In view of recent results we can state that the running time of our
algorithm is O(m · α(m,n)). Clearly, its running time is also Ω(m).

In the next section we review some well-known MST results that are used
by our algorithm. In Section 3 we prove a key lemma and give a procedure for
partitioning the graph in an MST-respecting manner. Section 4 gives an
overview of the optimal algorithm. Section 5 gives the algorithm and a proof
of optimality. In Section 6 we show our algorithm runs in linear time w.h.p.
if the input graph is selected at random. Sections 7 & 8 discuss related
problems, open questions, and the actual complexity of MST.

2 Preliminaries 

The input is an undirected graph G = (V,E) where each edge is assigned a
distinct real-valued weight. The minimum spanning forest (MSF) problem asks
for a spanning acyclic subgraph of G having the least total weight. Throughout 
the paper m and n denote the number of edges and vertices in the graph. 

It is well-known that one can identify edges provably in the MSF using the 
cut property, and edges provably not in the MSF using the cycle property. The 
cut property states that the lightest edge crossing any partition of the vertex 
set into two parts must belong to the MSF. The cycle property states that the 
heaviest edge in any cycle in the graph cannot be in the MSF. 

2.1 Boruvka Steps 

The earliest known MSF algorithm is due to Boruvka [Bor26]. It proceeds in a
sequence of stages, and in each stage it executes a Boruvka step on the graph
G, which identifies the set F consisting of the minimum-weight edge incident
on each vertex in G, adds these edges to the MSF (by the cut property),
and then forms the graph G_1 = G\F as the input to the next stage, where
G\F is the graph obtained by contracting each connected component formed
by F. This computation can be performed in linear time. Since the number of
vertices reduces by at least a factor of two, the running time of this
algorithm is O(m log n).

Our optimal algorithm uses a procedure called Boruvka2(G; F, G'). This pro- 
cedure executes two Boruvka steps on the input graph G and returns the con- 
tracted graph G' as well as the set of edges F identified for the MSF during 
these two steps. 
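
A single Boruvka step, as described at the start of this subsection, can be sketched as follows. This is an illustrative Python version, not the paper's pointer-machine implementation: the edge representation `(weight, u, v)`, the union-find helper, and the use of the (distinct) weights as edge identifiers are assumptions.

```python
def boruvka_step(n, edges):
    """One Boruvka step on vertices 0..n-1 with distinct edge weights.

    edges: list of (weight, u, v) with u != v.
    Returns (chosen_weights, n_contracted, contracted_edges).
    """
    # Minimum-weight edge incident on each vertex.
    best = [None] * n
    for e in edges:
        w, u, v = e
        for x in (u, v):
            if best[x] is None or w < best[x][0]:
                best[x] = e
    chosen = {e for e in best if e is not None}

    # Contract the components formed by the chosen edges (union-find).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in chosen:
        parent[find(u)] = find(v)

    roots = sorted({find(x) for x in range(n)})
    label = {r: k for k, r in enumerate(roots)}
    contracted = [(w, label[find(u)], label[find(v)])
                  for w, u, v in edges if find(u) != find(v)]
    return {w for w, _, _ in chosen}, len(roots), contracted


def boruvka_msf(n, edges):
    """Repeat Boruvka steps until no edges remain; returns MSF edge weights."""
    msf = set()
    while edges:
        chosen, n, edges = boruvka_step(n, edges)
        msf |= chosen
    return msf


# Two disjoint light edges joined by heavier ones: two stages are needed.
weights = boruvka_msf(4, [(1, 0, 1), (2, 2, 3), (10, 1, 2), (11, 0, 3)])
```

Calling `boruvka_step` twice in succession corresponds to what the text calls Boruvka2.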

2.2 Dijkstra-Jarnik-Prim Algorithm

Another early MSF algorithm that runs in O(m log n) time is the one by Jarnik
[Jar30], rediscovered by Dijkstra [Dij59] and Prim [Prim57]. We will refer to
this algorithm as the DJP algorithm. Briefly, the DJP algorithm grows a tree
T, which initially consists of an arbitrary vertex, one edge at a time,
choosing the next edge by the following simple criterion: Augment T with the
minimum weight edge (x, y) such that x ∈ T and y ∉ T. By the cut property, all
edges in T are in the MSF. We omit the proof of the following simple lemma.

Lemma 1. Let T be the tree formed after the execution of some number of steps
of the DJP algorithm. Let e and f be two arbitrary edges, each with exactly one
endpoint in T, and let g be the maximum weight edge on the path from e to f in
T. Then g cannot be heavier than both e and f.
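
The DJP growth rule can be sketched with an ordinary binary heap. This is a minimal illustration, assuming an adjacency-list input and integer vertex names; the function name `djp` and the data layout are not from the paper.

```python
import heapq


def djp(adj, start):
    """Grow a tree from `start` by the DJP rule.

    adj: dict mapping each vertex to a list of (weight, neighbour) pairs.
    Returns the tree edges as (weight, x, y) in the order they were added,
    where x was already in the tree and y was not.
    """
    in_tree = {start}
    heap = [(w, start, y) for w, y in adj[start]]
    heapq.heapify(heap)
    tree = []
    while heap:
        w, x, y = heapq.heappop(heap)
        if y in in_tree:
            continue                      # both endpoints already in T
        in_tree.add(y)
        tree.append((w, x, y))
        for w2, z in adj[y]:
            if z not in in_tree:
                heapq.heappush(heap, (w2, y, z))
    return tree


# A 4-cycle 0-1-2-3-0 with a chord 0-2.
adj = {
    0: [(1, 1), (4, 3), (5, 2)],
    1: [(1, 0), (2, 2)],
    2: [(2, 1), (3, 3), (5, 0)],
    3: [(3, 2), (4, 0)],
}
tree = djp(adj, 0)
```

Stopping the loop after some number of additions gives the partial tree T that Lemma 1 refers to.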

2.3 The Dense Case Algorithm 

The procedure DenseCase(G; F) takes as input an n_1-node, m_1-edge graph G,
where m_1 ≤ m and n_1 ≤ n/log^(3) n, and returns the MSF F of G. It is not
difficult to see that the algorithms presented in [FT87, GGST86, Chaz97,
Chaz99, Pet99] will find the MSF of G in O(n + m) time.

2.4 Soft Heap 

The main data structure used by our algorithm is the Soft Heap [Chaz98]. The
Soft Heap is a kind of priority queue that gives us an optimal tradeoff between
accuracy and speed. It supports the following operations:

• MakeHeap(): returns an empty soft heap.

• Insert(S, x): insert item x into heap S.

• Findmin(S): returns item with smallest key in heap S.

• Delete(S, x): delete x from heap S.

• Meld(S_1, S_2): create new heap containing the union of items stored in S_1
and S_2, destroying S_1 and S_2 in the process.

All operations take constant amortized time, except for Insert, which takes
O(log(1/ε)) time. To save time the Soft Heap allows items to be grouped
together and treated as though they have a single key. An item adopts the
largest key of any item in its group, corrupting the item if its new key
differs from its original key. Thus the original key of an item returned by
Findmin (i.e. any item in the group with minimum key) is no more than the keys
of all uncorrupted items in the heap. The guarantee is that after n Insert
operations, no more than εn corrupted items are in the heap. The following
result is shown in [Chaz98].

Lemma 2. Fix any parameter 0 < ε ≤ 1/2, and beginning with no prior data,
consider a mixed sequence of operations that includes n inserts. On a Soft Heap
the amortized complexity of each operation is constant, except for insert,
which takes O(log(1/ε)) time. At most εn items are corrupted at any given time.

3 A Key Lemma and Procedure 

3.1 A Robust Contraction Lemma 

It is well known that if T is a tree of MSF edges, we can contract T into a single 
vertex while maintaining the invariant that the MSF of the contracted graph 
plus T gives the MSF for the graph before contraction. 

In our algorithm we will find a tree of MSF edges T in a corrupted graph,
where some of the edge weights have been increased due to the use of a Soft
Heap. In the lemma given below we show that useful information can be obtained
by contracting certain corrupted trees, in particular those constructed using
some number of steps from the Dijkstra-Jarnik-Prim (DJP) algorithm. Ideas
similar to these are used in Chazelle's 1997 algorithm [Chaz97], and more
explicitly in the recent algorithms of Pettie [Pet99] and Chazelle [Chaz99].

Before stating the lemma, we need some notation and preliminary concepts. 
Let V{G) and E{G) be the vertex and edge sets of G. Let the G-weight of an 
edge be its weight in graph G (the G may be omitted if implied from context). 

For the following definitions, M and C are subgraphs of G. Denote by G ⇑ M
a graph derived from G by raising the weight of each edge in M by arbitrary
amounts (these edges are said to be corrupted). Let M_C be the set of edges in
M with exactly one endpoint in C. Let G\C denote the graph obtained by
contracting all connected components induced by C, i.e. by replacing each
connected component with a single vertex and reassigning edge endpoints
appropriately.

We define a subgraph C of G to be DJP-contractible if after executing the
DJP algorithm on G for some number of steps, with a suitable start vertex in
C, the tree that results is a spanning tree for C.

Lemma 3. Let M be a set of edges in a graph G. If C is a subgraph of G
that is DJP-contractible w.r.t. G ⇑ M, then MSF(G) is a subset of
MSF(C) ∪ MSF(G\C − M_C) ∪ M_C.

Proof. Each edge in C that is not in MSF(C) is the heaviest edge on some cycle
in C. Since that cycle exists in G as well, that edge is not in MSF(G). So we
need only show that edges in G\C that are not in MSF(G\C − M_C) ∪ M_C are
also not in MSF(G).

Let H = G\C − M_C; hence we need to show that no edge in H − MSF(H)
is in MSF(G). Let e be the heaviest edge on some cycle χ in H (i.e.
e ∈ H − MSF(H)). If χ does not involve the vertex derived by contracting C,
then it exists in G as well and e ∉ MSF(G). Otherwise, χ forms a path P in G
whose end points, say x and y, are both in C. Let the end edges of P be (x, w)
and (y, z). Since H included no corrupted edges with one end point in C, the
G-weight of these edges is the same as their (G ⇑ M)-weight.

Let T be the spanning tree of C ⇑ M derived by the DJP algorithm, Q be
the path in T connecting x and y, and g be the heaviest edge in Q. Notice that
P ∪ Q forms a cycle. By our choice of e, it must be heavier than both (x, w)
and (y, z), and by Lemma 1 the heavier of (x, w) and (y, z) is heavier than the
(G ⇑ M)-weight of g, which is an upper bound on the G-weights of all edges in
Q. So w.r.t. G-weights, e is the heaviest edge on the cycle P ∪ Q and cannot be
in MSF(G).

3.2 The Partition Procedure 

Our algorithm uses the Partition procedure given below. This procedure finds
DJP-contractible subgraphs C_1, ..., C_k in which edges are progressively being
corrupted by the Soft Heap. Let M_{C_i} contain only those corrupted edges with
one endpoint in C_i at the time it is completed.

Each subgraph C_i will be DJP-contractible w.r.t. a graph derived from G
by several rounds of contractions and edge deletions. When C_i is finished it
is contracted and all incident corrupted edges are discarded. By applying
Lemma 3 repeatedly we see that after C_i is built, the MSF of G is a subset of

\[
\bigcup_{j=1}^{i} MSF(C_j) \;\cup\;
MSF\Bigl( G \backslash \bigcup_{j=1}^{i} C_j \;-\; \bigcup_{j=1}^{i} M_{C_j} \Bigr)
\;\cup\; \bigcup_{j=1}^{i} M_{C_j}
\]

Below, arguments appearing before the semicolon are inputs; the outputs will
be returned in the other arguments. M is a set of edges and C = {C_1, ..., C_k}
is a set of subgraphs of G. No edge will appear in more than one of
M, C_1, ..., C_k.

Partition(G, maxsize, ε; M, C)
    All vertices are initially "live"
    M := ∅
    i := 0
    While there is a live vertex
        Increment i
        Let V_i := {v}, where v is any live vertex
        Create a Soft Heap consisting of v's edges (uses ε)
        While all vertices in V_i are live and |V_i| < maxsize
            Repeat
                Find and delete min-weight edge (x,y) from Soft Heap
            Until y ∉ V_i    (Assume w.l.o.g. x ∈ V_i)
            V_i := V_i ∪ {y}
            If y is live, insert each of y's edges into the Soft Heap
        Set all vertices in V_i to be dead
        Let M_{V_i} be the corrupted edges with one endpoint in V_i
        M := M ∪ M_{V_i};  G := G − M_{V_i}
        Dismantle the Soft Heap
    Let C := {C_1, ..., C_i} where C_z is the subgraph induced by V_z
    Exit.

Initially, Partition sets every vertex to be live. The objective is to convert
each vertex to dead, signifying that it is part of a component C_i with
≤ maxsize vertices and part of a conglomerate of ≥ maxsize vertices, where a
conglomerate is a connected component of the graph ⋃_i E(C_i). Intuitively a
conglomerate is a collection of C_i's linked by common vertices. This scheme
for growing components is similar to the one given in [FT87].

We grow the C_i's one at a time according to the DJP algorithm, except that
we use a Soft Heap. A component is done growing if it reaches maxsize vertices
or if it attaches itself to an existing component. Clearly if a component does
not reach maxsize vertices, it has linked to a conglomerate of at least maxsize
vertices. Hence all its vertices can be designated dead. Upon completion of a
component C_i, we discard the set of corrupted edges with one endpoint in C_i.

The running time of Partition is dominated by the heap operations, which
depend on ε. Each edge is inserted into a Soft Heap no more than twice (once
for each endpoint), and extracted no more than once. We can charge the cost of
dismantling the heap to the insert operations which created it, hence the total
running time is O(m log(1/ε)). The number of discarded edges is bounded by the
number of insertions scaled by ε, thus |M| ≤ 2εm. Thus we have

Lemma 4. Given a graph G, any 0 < ε ≤ 1/2 and a parameter maxsize, Partition
finds edge-disjoint subgraphs M, C_1, ..., C_k in time O(|E(G)| · log(1/ε))
while satisfying several conditions:

a) For all v ∈ V(G) there is some i s.t. v ∈ V(C_i).

b) For all i, |V(C_i)| ≤ maxsize.

c) For each conglomerate P ∈ ⋃_i C_i, |V(P)| ≥ maxsize.

d) |E(M)| ≤ 2ε · |E(G)|

e) MSF(G) ⊆ ⋃_i MSF(C_i) ∪ MSF(G\(⋃_i C_i) − M) ∪ M
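
The live/dead growth pattern of Partition can be sketched in Python. Note the substitution: this sketch uses an ordinary exact binary heap instead of a Soft Heap, so no edge is ever corrupted and M stays empty; it illustrates only conditions (a) to (c) of Lemma 4. The name `partition_exact`, the adjacency-list input, and the early exit when the heap runs dry are assumptions added for illustration.

```python
import heapq


def partition_exact(adj, maxsize):
    """Grow components as in Partition, but with an exact heap (no corruption).

    adj: dict mapping each vertex to a list of (weight, neighbour) pairs.
    Returns the vertex sets V_1, ..., V_k in the order they were completed.
    """
    live = {v: True for v in adj}
    components = []
    for v in adj:                          # "any live vertex"
        if not live[v]:
            continue
        Vi = {v}
        heap = [(w, v, y) for w, y in adj[v]]
        heapq.heapify(heap)
        while all(live[u] for u in Vi) and len(Vi) < maxsize:
            y = None
            while heap:                    # Repeat ... Until y not in V_i
                _, _, cand = heapq.heappop(heap)
                if cand not in Vi:
                    y = cand
                    break
            if y is None:
                break                      # whole connected component absorbed
            Vi.add(y)
            if live[y]:
                for w2, z in adj[y]:
                    heapq.heappush(heap, (w2, y, z))
        for u in Vi:
            live[u] = False
        components.append(Vi)
    return components


# Path 0-1-2-3-4-5 with edge weights 1..5.
path = {
    0: [(1, 1)],
    1: [(1, 0), (2, 2)],
    2: [(2, 1), (3, 3)],
    3: [(3, 2), (4, 4)],
    4: [(4, 3), (5, 5)],
    5: [(5, 4)],
}
parts = partition_exact(path, 3)
```

On this path the first component reaches maxsize, and each later component stops as soon as it touches a dead vertex, so all components join one conglomerate of six vertices.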

4 Overview of the Optimal Algorithm 

Here is an overview of our optimal MSF algorithm.

— In the first stage we find DJP-contractible subgraphs C_1, C_2, ..., C_k with
their associated set of edges M = ⋃_i M_{C_i}, where M_{C_i} consists of
corrupted edges with one endpoint in C_i.

— In the second stage we find the MSF F_i of each C_i, and the MSF F_0 of the
contracted graph G\(⋃_i C_i) − ⋃_i M_{C_i}. By Lemma 3 the MSF of the whole
graph is contained within F_0 ∪ ⋃_i (F_i ∪ M_{C_i}). Note that at this point we
have not identified any edges as being in the MSF of the original graph G.

— In the third stage we find some MSF edges, via Boruvka steps, and recurse
on the graph derived by contracting these edges.

We execute the first stage using the Partition procedure described in the 
previous section. 

We execute the second stage with optimal decision trees. Essentially, these
are hardwired algorithms designed to compute the MSF of a graph using an
optimal number of edge-weight comparisons. In general, decision trees are much
larger than the size of the problem that they solve and finding optimal ones
is very time consuming. We can afford the cost of building decision trees by
guaranteeing that each one is extremely small. At the same time, we make each
conglomerate formed by the C_i to be sufficiently large so that the MSF F_0 of
the contracted graph can be found in linear time using the DenseCase algorithm.

Finally, in the third stage, we have a reduction in vertices due to the Boruvka
steps, and a reduction in edges due to the application of Lemma 3. In our
optimal algorithm both vertices and edges reduce by a constant factor, thus
resulting in the recursive applications of the algorithm on graphs with
geometrically decreasing sizes.

4.1 Decision Trees 

An MSF decision tree is a rooted binary tree having an edge-weight comparison
associated with each internal node (e.g. weight(x,y) < weight(w,z)). The left
child represents that the comparison is true, the right child that it is false.
Associated with each leaf is a spanning forest. An MSF decision tree is correct
if the edge-weight comparisons encountered on any path from the root to a leaf
uniquely identify the spanning forest at that leaf as the MSF. A decision tree
is optimal if it is correct and there exists no correct decision tree with
lesser depth.

Using brute force search, the optimal MSF decision trees for all graphs on
≤ log^(3) n vertices may be constructed and checked in o(n) time.

Our algorithm will use a procedure DecisionTree(𝒢; ℱ), which takes as
input a collection of graphs 𝒢, each with at most log^(3) n vertices, and
returns their minimum spanning forests in ℱ using the precomputed decision
trees.

5 The Algorithm 

As discussed above, the optimal MSF algorithm is as follows. First, precompute
the optimal decision trees for all graphs with ≤ log^(3) n vertices. Next,
divide the input graph into subgraphs C_1, C_2, ..., C_k, discarding the set of
corrupted edges M_{C_i} as each C_i is completed. Use the decision trees found
earlier to compute the MSF F_i of each C_i, then contract each connected
component spanned by F_1 ∪ ... ∪ F_k (i.e., each conglomerate) into a single
vertex. The resulting graph has ≤ n/log^(3) n vertices since each conglomerate
has at least log^(3) n vertices by Lemma 4. Hence we can use the DenseCase
algorithm to compute its MSF F_0 in time linear in m. At this point, by Lemma 4
the MSF is now contained in the edge set
F_0 ∪ ... ∪ F_k ∪ M_{C_1} ∪ ... ∪ M_{C_k}. On this graph we apply two Boruvka
steps and then compute its MSF recursively. The algorithm is given below.

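OptimalMSF($G$)
  If $E(G) = \emptyset$ then Return($\emptyset$)
  Let $\epsilon := 1/8$ and $\mathit{maxsize} := \lceil \log^{(3)} |V(G)| \rceil$
  Partition($G$, $\mathit{maxsize}$, $\epsilon$; $M$, $\mathcal{C}$)
  DecisionTree($\mathcal{C}$; $\mathcal{F}$)
  Let $k := |\mathcal{C}|$ and let $\mathcal{C} = \{C_1, \ldots, C_k\}$, $\mathcal{F} = \{F_1, \ldots, F_k\}$
  $G_a := G \setminus (F_1 \cup \ldots \cup F_k) - M$
  DenseCase($G_a$; $F_0$)
  $G_b := F_0 \cup F_1 \cup \ldots \cup F_k \cup M$
  Boruvka2($G_b$; $F'$, $G_c$)
  $F :=$ OptimalMSF($G_c$)
  Return($F \cup F'$)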
Apart from recursive calls and using the decision trees, the computation
performed by OptimalMSF is clearly linear since Partition takes $O(m \log(1/\epsilon))$
time, and owing to the reduction in vertices, the call to DenseCase also takes
linear time. For $\epsilon = 1/8$, the number of edges passed to the final recursive call is
$\le m/4 + n/4 \le m/2$, giving a geometric reduction in the number of edges. Since
no MSF algorithm can do better than linear time, the bottleneck, if any, must
lie in using the decision trees, which are optimal by construction.
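The Borůvka steps are what shrink the vertex count before the recursive call. As a rough illustration, a single Borůvka step might be sketched as follows. This is a minimal sketch, assuming distinct edge weights and an edge-list representation; the name `boruvka_step` and its return convention are hypothetical and are not the paper's Boruvka2 procedure.

```python
def boruvka_step(n, edges):
    """One Boruvka step on vertices 0..n-1 (illustrative sketch).

    edges: list of (weight, u, v) tuples with distinct weights (assumption).
    Returns (chosen, n_new, contracted): the MSF edges selected in this step,
    the number of vertices after contraction, and the contracted edge list
    with self-loops removed (parallel edges are kept).
    """
    # Each vertex selects its minimum-weight incident edge.
    best = [None] * n
    for e in edges:
        _, u, v = e
        if u == v:
            continue
        for x in (u, v):
            if best[x] is None or e < best[x]:
                best[x] = e
    chosen = {e for e in best if e is not None}

    # Contract the selected edges using a simple union-find.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, u, v in chosen:
        parent[find(u)] = find(v)

    # Relabel components 0..n_new-1 and drop edges inside a component.
    label = {}
    for x in range(n):
        label.setdefault(find(x), len(label))
    contracted = [(w, label[find(u)], label[find(v)])
                  for w, u, v in edges if find(u) != find(v)]
    return chosen, len(label), contracted
```

Every non-isolated vertex is merged with at least one neighbor, so each step at least halves the number of such vertices; two steps therefore leave at most a quarter of them.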

More concretely, let $T(m,n)$ be the running time of OptimalMSF. Let
$T^*(m,n)$ be the optimal number of comparisons needed on any graph with
$n$ vertices and $m$ edges and let $T^*(G)$ be the optimal number of comparisons
needed on a specific graph $G$. The recurrence relation for $T$ is given below. For
the base case note that the graphs in the recursive calls will be connected if
the input graph is connected. Hence the base case graph has no edges and one
vertex, and we have $T(0, 1)$ equal to a constant.

$$T(m,n) \le \sum_i T^*(C_i) + T(m/2, n/4) + c_1 \cdot m$$

It is straightforward to see that if $T^*(m,n) = O(m)$ then the above recurrence
gives $T(m,n) = O(m)$. One can also show that $T(m,n) = O(T^*(m,n))$ for
many natural functions for $T^*$ (including $m \cdot \alpha(m,n)$). However, to show that
this result holds no matter what the function describing $T^*(m,n)$ is, we need
to establish some results on the decision tree complexity of the MSF problem,
which we do in the next section.
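The linear case can be checked by unrolling the recurrence. The following is a sketch under the assumption that $T^*(m,n) \le c_0 m$ for some constant $c_0$ (a name introduced here for illustration); since the $C_i$ are edge-disjoint, $\sum_i T^*(C_i) \le c_0 \sum_i |E(C_i)| \le c_0 m$, and therefore

$$T(m,n) \le (c_0 + c_1)\, m + T(m/2, n/4) \le (c_0 + c_1)\, m \sum_{j \ge 0} 2^{-j} + T(0,1) = 2(c_0 + c_1)\, m + O(1) = O(m).$$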

5.1 Some Results for MSF Decision Trees 

In this section we establish some results on MSF decision trees that allow us to
establish our main result that OptimalMSF runs in $O(T^*(m,n))$ time.

Proving the following propositions is straightforward. 

Proposition 1. $T^*(m,n) \ge m/2$.

Proposition 2. For fixed $m$ and $n' > n$, $T^*(m,n') \ge T^*(m,n)$.

Proposition 3. For fixed $n$ and $m' > m$, $T^*(m',n) \ge T^*(m,n)$.

We now state a property that is used by Lemmas 5 and 6.

Property 1. The structure of $G$ dictates that $MSF(G) = MSF(C_1) \cup \ldots \cup MSF(C_k)$,
where $C_1, \ldots, C_k$ are edge-disjoint subgraphs of $G$.

If $C_1, \ldots, C_k$ are the components returned by Partition, it can be seen that
the graph $\bigcup_i C_i$ satisfies Property 1, since every simple cycle in this graph must
be contained in exactly one of the $C_i$.

The proof of the following lemma can be found in [PR99b].

Lemma 5. If Property 1 holds for $G$, then there exists an optimal MSF decision
tree for $G$ which makes no comparisons of the form $e < f$ where $e \in C_i$, $f \in C_j$
and $i \ne j$.

Lemma 6. If Property 1 holds for $G$, then $T^*(G) = \sum_i T^*(C_i)$.

Proof. (Sketch) Given an optimal decision tree $\mathcal{T}$ for $G$ as in Lemma 5, we show
that $\mathcal{T}$ can be transformed into a ‘canonical’ decision tree $\mathcal{T}'$ for $G$ of the same
height such that in $\mathcal{T}'$, comparisons for $C_i$ precede comparisons for $C_{i+1}$, for each
$i$, and further, the subgraph of $\mathcal{T}'$ containing the comparisons within $C_i$ consists
of isomorphic trees. This establishes the desired result since $\mathcal{T}'$ must contain a
path that is the concatenation of the longest path in an optimal decision tree
for each of the $C_i$. For details see [PR99b].

Corollary 1. Let the $C_i$ be the components formed by Partition when applied
to $G$. Then $\sum_i T^*(C_i) = T^*(G) \le T^*(m,n)$.

Corollary 2. For any $m$ and $n$, $2 \cdot T^*(m,n) \le T^*(2m, 2n)$.

We can now solve the recurrence relation given in the previous section. 

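$$
\begin{aligned}
T(m,n) &\le \sum_i T^*(C_i) + T(m/2, n/4) + c_1 \cdot m \\
&\le T^*(m,n) + T(m/2, n/4) + c_1 \cdot m && \text{(Corollary 1)} \\
&\le T^*(m,n) + c \cdot T^*(m/2, n/4) + c_1 \cdot m && \text{(assume inductively)} \\
&\le T^*(m,n)\,(1 + c/2 + 2c_1) && \text{(Corollary 2 and Propositions 1, 2)} \\
&\le c \cdot T^*(m,n) && \text{(for sufficiently large } c\text{; this completes the induction)}
\end{aligned}
$$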
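The step that invokes Corollary 2 and the propositions compresses three estimates. Spelled out (a sketch, reading Proposition 1 as $T^*(m,n) \ge m/2$ and Corollary 2 with $m/2$, $n/2$ in place of $m$, $n$), they are

$$
\begin{aligned}
c \cdot T^*(m/2, n/4) &\le c \cdot T^*(m/2, n/2) && \text{(Proposition 2)} \\
&\le \tfrac{c}{2}\, T^*(m,n) && \text{(Corollary 2)} \\
c_1 \cdot m &\le 2 c_1\, T^*(m,n) && \text{(Proposition 1)}
\end{aligned}
$$

so the final inequality holds as soon as $1 + c/2 + 2c_1 \le c$, that is, for any $c \ge 2 + 4c_1$.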

This gives us the desired theorem. 

Theorem 1. Let $T^*(m,n)$ be the decision-tree complexity of the MSF problem
on graphs with $m$ edges and $n$ nodes. Algorithm OptimalMSF computes the MSF
of a graph with $m$ edges and $n$ vertices deterministically in $O(T^*(m,n))$ time.

6 Performance on Random Graphs 

Even if we assume that MST has some super-linear complexity, we show below
that our algorithm runs in linear time for nearly all graphs, regardless of edge
weights. This improves upon the expected linear-time result of Karp and Tarjan,
which depended on the edge weights being chosen randomly. Our result
may also be contrasted with the randomized algorithm of Karger et al. [KKT95],
which is shown to run in $O(m)$ time w.h.p. by a proof that depends on the
permutation of edge weights and random bits chosen, not the graph topology.
Throughout this section $\alpha$ will denote $\alpha(m,n)$. Most proofs in this section are
omitted due to lack of space.

Theorem 2. With probability $1 - e^{-\Omega(m/\alpha^2)}$, the MST of a graph drawn from
$G_{n,m}$ can be found in linear time, regardless of the permutation of edge weights.

In the next section we describe the edge-addition martingale for the $G_{n,m}$
model. In Section 6.2 we use this martingale and Azuma’s inequality to prove
Theorem 2.

6.1 The Edge-Addition Martingale 

We use the $G_{n,m}$ random graph model, that is, each graph with $n$ labeled vertices
and $m$ edges is equally likely (the result can be extended to $G_{n,p}$). For analytical
purposes, we select a random graph by beginning with $n$ vertices and adding one
edge at a time [ER61]. Let $X_i$ be a random edge s.t. $X_i \ne X_j$ for $j < i$, and
$G_i = \{X_1, \ldots, X_i\}$ be the graph made up of the first $i$ edges, with $G_0$ being the
graph on $n$ vertices having no edges.
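The edge-addition process can be simulated directly. The following is a minimal sketch, assuming vertices are labeled `0..n-1` and edges are unordered pairs; the function names are hypothetical and chosen only for illustration.

```python
import itertools
import random


def edge_addition_sequence(n, m, rng=None):
    """Return edges X_1, ..., X_m of a uniform random graph from G_{n,m}.

    Sampling without replacement yields distinct edges in uniformly random
    order, so each X_i is uniform among the edges not already in G_{i-1}.
    """
    rng = rng or random.Random()
    all_pairs = list(itertools.combinations(range(n), 2))
    if m > len(all_pairs):
        raise ValueError("m exceeds the number of vertex pairs")
    return rng.sample(all_pairs, m)


def prefix_graph(seq, i):
    """G_i: the edge set formed by the first i edges of the sequence."""
    return set(seq[:i])
```

Here `prefix_graph(seq, 0)` is the edge-free graph $G_0$ and `prefix_graph(seq, m)` is the final graph $G_m$.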

We prove that if $g$ is any graph-theoretic function and $g_E(G_i) = E[g(G_m) \mid G_i]$,
then $g_E(G_i)$, for $0 \le i \le m$, is a martingale. We call this the edge-addition
martingale in contrast to the edge-exposure martingale for $G_{n,p}$.

A martingale is a sequence of random variables $Y_0, Y_1, \ldots, Y_m$ such that
$E[Y_i \mid Y_{i-1}] = Y_{i-1}$ for $0 < i \le m$.

Lemma 7. The sequence $g_E(G_i) = E[g(G_m) \mid G_i]$, for $0 \le i \le m$, is a
martingale, where $g$ is any graph-theoretic function, $G_0$ is the edge-free graph on $n$
vertices, and $G_i$ is derived from $G_{i-1}$ by adding a random edge not in $G_{i-1}$ to
$G_{i-1}$.

We now recall the well-known Azuma’s inequality (see, e.g., [AS92]).

Theorem 3. (Azuma’s Inequality.) Let $Y_0, \ldots, Y_m$ be a martingale with
$|Y_i - Y_{i-1}| \le 1$ for $0 < i \le m$. Let $\lambda > 0$ be arbitrary. Then
$\Pr[|Y_m - Y_0| > \lambda\sqrt{m}] < 2e^{-\lambda^2/2}$.

To facilitate the application of Azuma’s inequality to the edge-addition mar- 
tingale we give the following lemma. 

Lemma 8. Consider the sequence proved to be a martingale in Lemma 7. Let
$g$ be any graph-theoretic function such that $|g(G) - g(G')| \le 1$ for any pair of
graphs $G$ and $G'$ of the form $G = H \cup \{e\}$ and $G' = H \cup \{e'\}$, for some graph
$H$. Then $|g_E(G_i) - g_E(G_{i-1})| \le 1$, for $0 < i \le m$.

6.2 Analysis 

We define the excess of a subgraph $H$ to be $|E(H)| - |F(H)|$, where $F(H)$ is any
spanning forest of $H$. Let $f(G)$ be the maximum excess of the graph made up of
intra-component edges, where the sets of components range over all possible sets
returned by the Partition procedure. (Recall that the size of any component is no
more than $k = \mathit{maxsize} = \log^{(3)} n$.)

Our key observation is that each pass of our optimal algorithm definitely
runs in linear time if $f(G) \le m/\alpha(m,n)$. To see this, note that if this bound
on $f(G)$ holds, we can reduce the total number of intra-component edges to
$\le 2m/\alpha$ in linear time using $\log \alpha$ Borůvka steps, and then, clearly, the MST of
the resulting graph can be determined in $O(m)$ time. We show below that if a
graph is randomly chosen from $G_{n,m}$, $f(G) \le m/\alpha(m,n)$ with high probability.

Define $f_E(G_i) = E[f(G_m) \mid G_i]$. The following lemma gives a bound on $f_E(G_0)$;
its proof is straightforward.

Lemma 9. $f_E(G_0) = o(m/\alpha)$.

The following two lemmas establish the application of Azuma’s inequality to
the graph-theoretic function $f$.

Lemma 10. Let $G = H \cup \{e\}$ and $G' = H \cup \{e'\}$ be two graphs on a set of
labeled vertices which differ by no more than one edge. Then $|f(G) - f(G')| \le 1$.

Lemma 11. Let $G$ be chosen from $G_{n,m}$. Then $\Pr[f(G) > m/\alpha] < e^{-\Omega(m/\alpha^2)}$.
Proof. (of Theorem 2.) We examine only the first $\log k$ passes of our optimal
algorithm, since all remaining passes certainly take $o(m)$ time. Lemma 11 assures
us that the first pass runs in linear time w.h.p. However, the topology of the
graph examined in later passes does depend on the edge weights. Assuming the
Borůvka steps contract all parts of the graph at a constant rate, which can easily
be enforced, a partition of the graph in one pass of the algorithm corresponds to
a partition of the original graph into components of size less than $k^c$ for some
fixed $c$. Using $k^c$ in place of $k$ does not affect Lemma 11, which gives the Theorem.

7 Discussion 

An intriguing aspect of our algorithm is that we do not know its precise
deterministic running time although we can prove that it is within a constant factor
of optimal. Results of this nature have been obtained in the past for sensitivity
analysis of minimum spanning trees [DRT92] and convex matrix searching.
Also, for the problem of triangulating a convex polygon, it was observed
in [DRT92] that an alternate linear-time algorithm could be obtained using
optimal decision trees on small subproblems. However, these earlier algorithms make
use of decision trees in more straightforward ways than the algorithm presented
here.

As noted earlier, the construction of optimal decision trees takes sub-linear 
time. Thus, it is important to observe that the use of decision trees in our 
algorithm does not result in a large constant factor in the running time, nor 
does it result in an algorithm that is non-uniform. 

It should be noted that the existence of a linear-time verification algorithm 
for MST immediately implies a naive optimal MST algorithm that is obtained 
by enumerating all possible algorithms, evaluating them incrementally, and ver- 
ifying the outputs until we encounter the correct output. However, the constant 
factor for this algorithm is astronomical, and it sheds no light on the relationship 
between the algorithmic and decision-tree complexities of the problem. 

8 Conclusion 

We have presented a deterministic MSF algorithm that is provably optimal. The
algorithm runs on a pointer machine, and on graphs with $n$ vertices and $m$ edges,
its running time is $O(T^*(m,n))$, where $T^*(m,n)$ is the decision-tree complexity
of the MSF problem on $n$-node, $m$-edge graphs. Also, on random graphs our
algorithm runs in linear time with high probability for all possible edge-weights.
Although the exact running time of our algorithm is not known, we have shown
that the time bound depends only on the number of edge-weight comparisons
needed to determine the MSF, and not on data structural issues.

Pinning down the function that describes the worst-case complexity of our 
algorithm is the main open question that remains for the sequential complexity 
of the MSF problem. A related question is the parallel work-time complexity of 
this problem. In this context, resolved recently were the randomized work-time
complexity [PR99a] and the deterministic time complexity [CHL99] of the MSF
problem on the EREW PRAM. An open question that remains here is to obtain
a deterministic work-time optimal parallel MSF algorithm.
Abstract. Thorup recently showed that single-source shortest-paths
problems in undirected networks with $n$ vertices, $m$ edges, and edge
weights drawn from $\{0, \ldots, 2^w - 1\}$ can be solved in $O(n + m)$ time and
space on a unit-cost random-access machine with a word length of $w$
bits. His algorithm works by traversing a so-called component tree. Two
new related results are provided here. First, and most importantly,
Thorup’s approach is generalized from undirected to directed networks. The
resulting time bound, $O(n + m \log w)$, is the best deterministic
linear-space bound known for sparse networks unless $w$ is superpolynomial
in $\log n$. As an application, all-pairs shortest-paths problems in directed
networks with $n$ vertices, $m$ edges, and edge weights in $\{-2^w, \ldots, 2^w\}$
can be solved in $O(nm + n^2 \log\log n)$ time and $O(n + m)$ space (not
counting the output space). Second, it is shown that the component tree
for an undirected network can be constructed in deterministic linear time
and space with a simple algorithm, to be contrasted with a complicated
and impractical solution suggested by Thorup. Another contribution of
the present paper is a greatly simplified view of the principles underlying
algorithms based on component trees.



1 Introduction 

The single-source shortest-paths (SSSP) problem asks, given a network $\mathcal{N}$ with
real-valued edge lengths and a distinguished vertex $s$ in $\mathcal{N}$ called the source, for
shortest paths in $\mathcal{N}$ from $s$ to all vertices in $\mathcal{N}$ for which such shortest paths
exist. It is one of the most fundamental and important network problems from
both a theoretical and a practical point of view. Actually, the more fundamental
and important problem is that of finding a shortest path from $s$ to a single given
vertex $t$, but this does not appear to be significantly easier than solving the
complete SSSP problem with source $s$.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 61-72, 2000.

This paper considers mainly the important special case of the SSSP problem
in which all edge lengths are nonnegative. The classic algorithm for this
special case is Dijkstra’s algorithm ISEEg. Dijkstra’s algorithm maintains for
every vertex $v$ in $\mathcal{N}$ a tentative distance from $s$ to $v$, processes the vertices
one by one, and always selects as the next vertex to be processed one whose
tentative distance is minimal. The operations that need to be carried out on
the set of unprocessed vertices, in particular identifying a vertex with minimal
key (= tentative distance), are supported by the priority-queue data type,
and therefore an efficient implementation of Dijkstra’s algorithm essentially boils
down to an efficient realization of the priority queue.

Suppose that the graph underlying $\mathcal{N}$ is $G = (V, E)$ and take $n = |V|$ and
$m = |E|$. An implementation of Dijkstra’s algorithm with a running time of
$O(n + m \log n)$ is obtained by realizing the priority queue through a binary heap
or a balanced search tree. Realizing the priority queue by means of a Fibonacci
heap, Fredman and Tarjan lowered the time bound to $O(m + n \log n)$. The
priority queues mentioned so far are comparison-based. In reality, however, edge
lengths are numeric quantities, and very frequently, they are or can be viewed
as integers. In the remainder of the paper, we make this assumption, which lets
a host of other algorithms come into play.
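For orientation, the binary-heap realization mentioned above might look as follows. This is a minimal sketch rather than the paper's formulation: it assumes an adjacency-list dictionary as input and uses lazy deletion of stale heap entries in place of a decrease-key operation; the function name `dijkstra` and the input format are illustrative choices.

```python
import heapq


def dijkstra(adj, s):
    """Single-source shortest paths with nonnegative edge lengths.

    adj: dict mapping each vertex to a list of (neighbor, length) pairs.
    Returns a dict of distances for all vertices reachable from s.
    """
    dist = {s: 0}          # tentative distances
    done = set()           # vertices already processed
    heap = [(0, s)]        # priority queue keyed by tentative distance
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue       # stale entry left behind by an earlier update
        done.add(u)
        for v, c in adj.get(u, []):
            nd = d + c
            if v not in dist or nd < dist[v]:
                dist[v] = nd                    # relax edge (u, v)
                heapq.heappush(heap, (nd, v))
    return dist
```

With lazy deletion the heap may hold up to $m$ entries, so this variant runs in $O((n + m) \log n)$ time.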

Our model of computation is the word RAM, which is like the classic unit-cost
random-access machine, except that for an integer parameter $w \ge 1$ called
the word length, the contents of all memory cells are integers in the range
$\{0, \ldots, 2^w - 1\}$, and that some additional instructions are available.
Specifically, the available unit-time operations are assumed to include addition and
subtraction, (noncyclic) bit shifts by an arbitrary number of positions, and
bitwise boolean operations, but not multiplication (the restricted instruction set).
Our algorithm for undirected networks in addition assumes the availability of
a unit-time “most-significant-bit” (MSB) instruction that, applied to a positive
integer $r$, returns $\lfloor \log r \rfloor$ (all logarithms are to base 2). When considering an
instance of the SSSP problem with $n$ vertices, we assume that $w \ge \log n$, since
otherwise $n$ is not a representable number. In the same vein, when nothing else
is stated, edge weights are assumed to be integers in the range $\{0, \ldots, 2^w - 1\}$.
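The effect of the MSB instruction is easy to state in code. The sketch below only illustrates the function being computed; Python integers are unbounded, so it does not model the unit-time cost assumed on the word RAM, and the name `msb` is chosen here for illustration.

```python
def msb(r):
    """Return floor(log2(r)) for a positive integer r."""
    if r <= 0:
        raise ValueError("r must be a positive integer")
    return r.bit_length() - 1
```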

We now discuss previous algorithms for the SSSP problem that work on the
word RAM, focusing first on deterministic algorithms. A well-known data
structure of van Emde Boas et al. m is a priority queue that allows insertion and
deletion of elements with keys in $\{0, \ldots, C\}$ as well as the determination of an
element with minimal key in $O(\log\log C)$ time per operation. This implies an
SSSP algorithm with a running time of $O(n + m \log w)$. In more recent work,
Thorup US) improved this to $O(n + m \log\log n)$, the best bound known for sparse
networks. Both algorithms, however, use $2^{\Theta(w)}$ space, which makes them
impractical if $w$ is larger than $\log n$ by a nonnegligible factor. A different algorithm by
Thorup [2Dj achieves $O(n + m(\log\log n)^2)$ time using linear space, $O(n + m)$.
Algorithms that are faster for denser networks were indicated by Ahuja et al. [2],
Cherkassky et al. 0, and Raman mW; their running times are of the forms
$O(m + n(\log n)^{\Theta(1)})$ and $O(m + n w^{\Theta(1)})$. Some of these algorithms employ
randomization, multiplication, and/or superlinear space. Using randomization, an
expected running time of $O(n + m \log\log n)$ can be achieved in linear space |l t)j .

Our first result is a new deterministic algorithm for the SSSP problem that
works in $O(n + m \log w)$ time. The time bound is never better than that of
Thorup HE). The new algorithm, however, works in linear space. For sparse
networks, the new algorithm is faster than all previous deterministic
linear-space solutions unless $w$ is superpolynomial in $\log n$. We actually prove a more
general result: If the edges can be partitioned into $b \ge 2$ groups such that the
lengths of the edges within each group differ by at most a constant factor, the
SSSP problem can be solved in $O(n \log\log n + m \log b)$ time. Our construction
implies that the all-pairs shortest-paths (APSP) problem in networks with $n$
vertices, $m$ edges, and edge weights drawn from $\{-2^w, \ldots, 2^w\}$ can be solved in
$O(nm + n^2 \log\log n)$ time and $O(n + m)$ space (not counting the output space).
No faster APSP algorithm is known for any combination of $n$ and $m$. If $m = \omega(n)$
and $m = o(n \log n)$, the new algorithm is faster than all previous algorithms.

In a remarkable development, Thorup m showed that the SSSP problem
can be solved in linear time and space for undirected networks. His algorithm
is all the more interesting in that it is not a new implementation of Dijkstra’s
algorithm. The vertices are still processed one by one, but the strict processing
in the order of increasing tentative distance is abandoned in favor of a more
permissive regime, the computation being structured with the aid of a so-called
component tree. Thorup provides two algorithms for constructing the component
tree. One uses the Q-heap data structure of Fredman and Willard ^ and works
in $O(n + m)$ time, but is complicated and utterly impractical. The other one is
simple and conceivably practical, but its running time is $O(m\alpha(m,n))$, where $\alpha$
is an “inverse Ackermann” function known from the analysis of a union-find data
structure ini. Our second result is a procedure for computing the component tree
of an undirected network that is about as simple as Thorup’s second algorithm,
but works in linear time and space.



2 Shortest Paths in Directed Networks 

This section proves our main result: 

Theorem 1. For all positive integers $n$, $m$ and $w$ with $w \ge \log n \ge 1$,
single-source shortest-paths problems in networks with $n$ vertices, $m$ edges, and edge
lengths in the range $\{0, \ldots, 2^w - 1\}$ can be solved in $O(n + m \log w)$ time and
$O(n + m)$ space on a word RAM with a word length of $w$ bits and the restricted
instruction set.

Let us fix a network $\mathcal{N}$ consisting of a directed graph $G = (V, E)$ and a length
function $c : E \to \{0, \ldots, 2^w - 1\}$ as well as a source $s \in V$ and take $n = |V| \ge 2$
and $m = |E|$. We assume without loss of generality that $G$ is strongly connected,
i.e., that every vertex is reachable from every other vertex. Then $m \ge n$, and we
can define $\delta(v)$ as the length of a shortest path in $G$ from $s$ to $v$, for all $v \in V$.
It is well-known that knowledge of $\delta(v)$ for all $v \in V$ allows us, in $O(m)$ time,
to compute a shortest-path tree of $G$ rooted at $s$ (see, e.g., [1, Section 4.3]), so
our task is to compute $\delta(v)$ for all $v \in V$.

Dijkstra’s algorithm for computing $\delta(v)$ for all $v \in V$ can be viewed as
simulating a fire starting at $s$ at time 0 and propagating along all edges at unit
speed. The algorithm maintains for each vertex $v$ an upper bound $d[v]$ on the
(simulated) time $\delta(v)$ when $v$ will be reached by the fire, equal to the time when
$v$ will be reached by the fire from a vertex $u$ already on fire with $(u, v) \in E$ ($\infty$ if
there is no such vertex). Whenever a vertex $u$ is attained by the fire ($u$ is visited),
the algorithm reconsiders $d[v]$ for each unvisited vertex $v$ with $(u, v) \in E$ and, if
the current estimate $d[v]$ is larger than the time $d[u] + c(u, v)$ when $v$ will be hit
by the fire from $u$, decreases $d[v]$ to $d[u] + c(u, v)$; in this case we say that the
edge $(u, v)$ is relaxed. The simulated time is then stepped to the time of the next
visit of a vertex, which is easily shown to be the minimal $d$ value of an unvisited
vertex.

Consider a distributed implementation of Dijkstra’s algorithm in which each
vertex $v$ is simulated by a different processor $P_v$. The relaxation of an edge $(u, v)$
is implemented through a message sent from $P_u$ to $P_v$ and specifying the new
upper bound on $\delta(v)$. For each $v \in V$, $P_v$ receives and processes all messages
pertaining to relaxations of edges into $v$, then reaches the simulated time $d[v]$ and
visits $v$, and subsequently sends out an appropriate message for each edge out of
$v$ that it relaxes. The implementation remains correct even if the processors do
not agree on the simulated time, provided that each message is received in time:
For each vertex $v$, a message specifying an upper bound of $t$ on $\delta(v)$ should be
received by $P_v$ before it advances its simulated time beyond $t$. If such a message
corresponds to the relaxation of an edge $e = (u, v)$, it was generated and sent by
$P_u$ at its simulated time $t - c(e)$. Provided that messages have zero transit times,
this shows that for all $e = (u, v) \in E$, we can allow the simulated time of $P_u$ to
lag behind that of $P_v$ by as much as $c(e)$ without jeopardizing the correctness
of the implementation. In order to capitalize on this observation, we define a
component tree as follows.

Take the level of each edge $e \in E$ to be the integer $i$ with $2^{i-1} \le c(e) < 2^i$
if $c(e) > 0$, and 0 if $c(e) = 0$. For each integer $i$, let $G_i$ be the subgraph of $G$
spanned by the edges of level at most $i$. A component tree for $\mathcal{N}$ is a tree $T$,
each of whose nodes $x$ is marked with a level in the range $\{-1, \ldots, w\}$, $level(x)$,
and a priority in the range $\{0, \ldots, n - 1\}$, $priority(x)$, such that the following
conditions hold:

1. The leaves of $T$ are exactly the vertices in $G$, and every inner node in $T$ has
at least two children.

2. Let $x$ and $y$ be nodes in $T$, with $x$ the parent of $y$. Then $level(x) > level(y)$.

3. Let $u$ and $v$ be leaf descendants of a node $x$ in $T$. Then there is a path from
$u$ to $v$ in $G_{level(x)}$.

4. Let $u$ and $v$ be leaf descendants of distinct children $y$ and $z$, respectively, of
a node $x$ in $T$. Then $priority(y) < priority(z)$ or there is no path from $u$ to
$v$ in $G_{level(x)-1}$.

The component tree is a generalization of the component tree of Thorup izq.
Let $T = (V_T, E_T)$ be a component tree for $\mathcal{N}$ and, for all $x \in V_T$, let $G_x$ be
the subgraph of $G_{level(x)}$ spanned by the leaf descendants of $x$. The conditions
imposed above can be shown to imply that for every $x \in V_T$, $G_x$ is a strongly
connected component (SCC) of $G_{level(x)}$, i.e., a maximal strongly connected
subgraph of $G_{level(x)}$.
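The definition of edge levels and of the subgraphs $G_i$ can be made concrete with a few lines of code. This is a minimal sketch, assuming edges are given as `(u, v, c)` triples with nonnegative integer lengths; the function names are hypothetical.

```python
def edge_level(c):
    """Level of an edge of length c: the i with 2**(i-1) <= c < 2**i, or 0 if c == 0.

    For nonnegative integers this is exactly the number of bits of c.
    """
    if c < 0:
        raise ValueError("edge lengths are assumed to be nonnegative")
    return c.bit_length()


def subgraph_up_to_level(edges, i):
    """Edges of G_i: those edges (u, v, c) whose level is at most i."""
    return [(u, v, c) for (u, v, c) in edges if edge_level(c) <= i]
```

For example, lengths 2 and 3 both have level 2, while length 4 has level 3.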
We carry out a simulation of the distributed implementation discussed above,
imagining the processor $P_v$ as located at the leaf $v$ of $T$, for each $v \in V$. The rest
of $T$ serves only to enforce a limited synchronization between the leaf processors,
as follows: For each integer $i$, define an $i$-interval to be an interval of the form
$[r \cdot 2^i, (r + 1) \cdot 2^i)$ for some integer $r \ge 0$. Conceptually, the algorithm manipulates
tokens, where an $i$-token is an abstract object labeled with an $i$-interval, for each
integer $i$, and a token is an $i$-token for some $i$. A nonroot node $x$ in $T$ occasionally
receives from its parent a token labeled with an interval $I$, interpreted as a
permission to advance its simulated time across $I$. If $x$ has level $i$ and is not a leaf,
it then splits $I$ into consecutive $(i-1)$-intervals $I_1, \ldots, I_k$ and, for $j = 1, \ldots, k$,
steps through its children in an order of nondecreasing priorities and, for each
child $y$, sends an $(i-1)$-token labeled with $I_j$ to $y$ and waits for a completion
signal from $y$ before stepping to its next child or to the next value of $j$. Once
the last child of $x$ has sent a completion signal for the last token to $x$, $x$ sends
a completion signal to its parent.
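The splitting of a token's interval into $(i-1)$-intervals can be illustrated as follows. This is a minimal sketch, assuming a finite interval `[lo, hi)` whose endpoints are multiples of $2^{i-1}$ and a level $i \ge 1$; it does not cover the root's unbounded interval or nodes of level 0, and the name `split_token` is hypothetical.

```python
def split_token(lo, hi, i):
    """Split [lo, hi) into consecutive (i-1)-intervals, each of length 2**(i-1).

    Assumes i >= 1 and that lo and hi are multiples of 2**(i-1).
    Intervals are returned as (start, end) pairs, end exclusive.
    """
    if i < 1:
        raise ValueError("this sketch assumes i >= 1")
    step = 1 << (i - 1)
    if lo % step or hi % step or lo >= hi:
        raise ValueError("interval must be nonempty and aligned to 2**(i-1)")
    return [(a, a + step) for a in range(lo, hi, step)]
```

A node of level 2 that receives the 3-interval $[8, 16)$ would thus hand the intervals $[8,10)$, $[10,12)$, $[12,14)$, $[14,16)$ to its children, one interval at a time.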

The root of T behaves in the same way, except that it neither generates 
completion signals nor receives tokens from a parent; we can pretend that the 
root initially receives a token labeled with the interval $[0, \infty)$. A leaf node $v$,
upon receiving a token labeled with an interval $I$ from its parent, checks whether
$d[v] \in I$ and, if so, visits $v$ and relaxes all edges leaving $v$ that yield a smaller
tentative distance. No “relaxation messages” need actually be generated; instead,
the corresponding decreases of d values are executed directly. Similarly, although 
the simulation algorithm was described above as though each node in T has its 
own processor, it is easily turned into a recursive or iterative algorithm for a 
single processor. 

Consider a relaxation of an edge $(u, v) \in E$ and let $x$, $y$, and $z$ be as in
condition (4) in the definition of a component tree. Then either $priority(y) <
priority(z)$, in which case the simulated time of $P_u$ never lags behind that of $P_v$,
or $c(u, v) \ge \lfloor 2^{i-1} \rfloor$, where $i = level(x)$. Since the synchronization enforced by
$x$ never allows the simulated times of two processors at leaves in its subtree to
differ by more than $2^{i-1}$, our earlier considerations imply that the simulation is
correct, i.e., the value of $\delta(v)$ is computed correctly for all $v \in V$.

As described so far, however, the simulation is not efficient. It is crucial not
to feed tokens into a node $x$ in $T$ before the first token that actually enables a
leaf descendant of $x$ to be visited, and also to stop feeding tokens into $x$ after
the last leaf descendant of $x$ has been visited. Thus each node $x$ of $T$ initially is
dormant, then it becomes active, and finally it becomes exhausted (except for
the root of $T$, which is always active). The transition of $x$ from the dormant
to the active state is triggered by the parent of $x$ producing a token labeled
with an interval that contains $d[x]$, defined to be the smallest $d$ value of a leaf
descendant of $x$. When the last leaf descendant of $x$ has been visited, on the other
hand, x notifies its parent that it wishes to receive no more tokens and enters 
the exhausted state. If x is an inner node, this simply means that x becomes 
exhausted when its last child becomes exhausted. 
The following argument of Thorup ED shows the total number of tokens
exchanged to be $O(m)$: The number of tokens “consumed” by a node $x$ in $T$ of
level $i$ is at most 1 plus the ratio of the diameter of $G_x$ to $2^i$. The “contribution”
of a fixed edge in $E$ to the latter ratio, at various nodes $x$ on a path in $T$, is
bounded by $\sum_{j=0}^{\infty} 2^{-j} = 2$. Since $T$ has fewer than $2n$ nodes, the total number of
tokens is bounded by $2n + 2m$. In order to supply its children with tokens in the
right order without violating the constraints implied by the children’s priorities,
each inner node in $T$ initially sorts its children by their priorities. Using
two-pass radix sort, this can be done together for all inner nodes in $O(n)$ total time.
Taking the resulting sequence as the universe, each inner node subsequently
maintains the set of its active children in a sorted list and, additionally, in a van
Emde Boas tree. The sorted list allows the algorithm to step to the next
active child in constant time, and the van Emde Boas tree allows it to insert
or delete an active child in $O(\log\log n)$ time. As there are $O(n)$ insertions and
deletions of active nodes over the whole simulation, the total time needed is
$O(m + n \log\log n) = O(m \log w)$. Since it is not difficult to implement the van
Emde Boas tree in space proportional to the size of the universe IS], the total
space needed by all instances of the data structure is $O(n)$.

If we ignore the time spent in constructing $T$ and in discovering nodes that
need to be moved from the dormant to the active state, the running time of the
algorithm is dominated by the contribution of $O(m \log w)$ identified above. We
now consider the two remaining problems.

2.1 Constructing the Component Tree 

We show how to construct the component tree $T$ in $O(m \min\{n, \log w\})$ time,
first describing a simple, but inefficient algorithm.

The algorithm maintains a forest $F$ that gradually evolves into the component
tree. Initially $F = (V, \emptyset)$, i.e., $F$ consists of $n$ isolated nodes. Starting from a
network $\mathcal{N}_{-1}$ that also contains the elements of $V$ as isolated vertices and no
edges, the algorithm executes $w + 1$ stages. In Stage $j$, for $j = 0, \ldots, w$, a network
$\mathcal{N}_j$ is obtained from $\mathcal{N}_{j-1}$ by inserting the edges in $\mathcal{N}$ of level $j$, computing the
SCCs of the resulting network, and contracting the vertices of each nontrivial
SCC to a single vertex. In $F$, each contraction of the vertices in a set $U$ is
mirrored by creating a new node that represents $U$, giving it level $j$, and making
it the parent of each node in $U$. Suitable priorities for the vertices in $U$ are
obtained from a topological sorting of the (acyclic) subgraph of $\mathcal{N}_{j-1}$ spanned
by the vertices in $U$. So that the remaining edges can be inserted correctly later,
their endpoints are updated to reflect the vertex contractions carried out in
Stage $j$. The resulting tree is easily seen to be a component tree.

Assuming that $w < m$, we lower the construction time from $O(mw)$ to
$O(m \log w)$ by carrying out a preprocessing step that allows each stage to be
executed with only the edges essential to that stage. For each $e = (u, v) \in E$,
define the essential level of $e$ to be the unique integer $i \in \{-1, \ldots, w - 1\}$ such
that $u$ and $v$ belong to distinct SCCs in $G_i$, but not in $G_{i+1}$. Starting with
$E_{-1} = E$ and $E_0 = E_1 = \cdots = E_{w-1} = \emptyset$, the following recursive algorithm,
which can be viewed as a batched binary search, stores in $E_i$ the subset of the
edges in $E$ of essential level $i$, for $i = -1, \ldots, w - 1$. The outermost call is
BatchedSearch($-1$, $w$).

procedure BatchedSearch($i$, $k$):
if $k - i \ge 2$ then
  $j := \lfloor (i + k)/2 \rfloor$;
  Let $\mathcal{N}_j$ be the subnetwork of $\mathcal{N}$ spanned by the edges in $E_i$ of level $\le j$;
  Compute the SCCs of $\mathcal{N}_j$;
  Move from $E_i$ to $E_j$ each edge with endpoints in distinct SCCs of $\mathcal{N}_j$;
  Contract each SCC of $\mathcal{N}_j$ to a single vertex
    and rename the endpoints of the edges in $E_j$ accordingly;
  BatchedSearch($i$, $j$);
  BatchedSearch($j$, $k$);

The calls of BatchedSearch form a complete binary call tree of depth O(log w).
If a call BatchedSearch(i, k) is associated with its lower argument i, each edge
can be seen to belong to only one set E_i whose index i is associated with a
call at a fixed level in the call tree. Since all costs of a call BatchedSearch(i, k),
exclusive of those of recursive calls, are O(1 + |E_i|), the execution time of the
algorithm is O(w + m log w) = O(m log w). Moreover, it can be seen that at the
beginning of each call BatchedSearch(i, k), E_i contains exactly those edges in E
whose endpoints belong to distinct SCCs in G_i, but not in G_k. Applied to the
leaves of the call tree, this shows the output of BatchedSearch to be as claimed.
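
The batched binary search can be sketched in Python as follows. This is only an illustration under stated assumptions: edges are triples (u, v, level) with levels in 0, ..., w, there are no self-loops, the full network is strongly connected, and the SCCs are computed with Kosaraju's algorithm. The names scc_labels and essential_levels are hypothetical, and no attempt is made to match the word-RAM time bounds.

```python
from collections import defaultdict

def scc_labels(vertices, edges):
    """Kosaraju's algorithm: map every vertex to a representative of its SCC."""
    adj, radj = defaultdict(list), defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)
    order, seen = [], set()
    for s in vertices:                      # first pass: record finishing order
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(adj[s]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adj[nxt])))
                    break
            else:                           # all successors explored
                order.append(node)
                stack.pop()
    label = {}
    for s in reversed(order):               # second pass on the reversed graph
        if s in label:
            continue
        label[s] = s
        stack = [s]
        while stack:
            node = stack.pop()
            for nxt in radj[node]:
                if nxt not in label:
                    label[nxt] = s
                    stack.append(nxt)
    return label

def essential_levels(edges, w):
    """Return the essential level of each edge (u, v, level), in input order."""
    E = defaultdict(list)
    E[-1] = [(u, v, lev, idx) for idx, (u, v, lev) in enumerate(edges)]

    def batched_search(i, k):
        if k - i < 2:
            return
        j = (i + k) // 2
        verts = {x for u, v, _, _ in E[i] for x in (u, v)}
        label = scc_labels(verts, [(u, v) for u, v, lev, _ in E[i] if lev <= j])
        stay = []
        for u, v, lev, idx in E[i]:
            if label[u] != label[v]:
                # moved to E_j, with endpoints renamed to the contracted SCCs
                E[j].append((label[u], label[v], lev, idx))
            else:
                stay.append((u, v, lev, idx))
        E[i] = stay
        batched_search(i, j)
        batched_search(j, k)

    batched_search(-1, w)
    result = [None] * len(edges)
    for i, group in E.items():
        for _, _, _, idx in group:
            result[idx] = i
    return result
```

For example, with edges 0→1 and 1→0 of level 0, 1→2 of level 1, 2→1 of level 2, 2→3 of level 0 and 3→0 of level 3, and w = 3, the two edges between 0 and 1 get essential level -1, the two edges between 1 and 2 get essential level 1, and the remaining two get essential level 2.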

We now use the original, simple algorithm, with the following modifications:
(1) Instead of renaming edge endpoints explicitly, we use an efficient union-find
data structure to map the endpoints of an edge to the nodes that resulted from
them through a sequence of node contractions. Over the whole construction, the
time needed for this is O(m α(m, n)) = O(m + n log log n) = O(m log w) [17]. (2)
In Stage j, for j = 0, ..., w, N_j is obtained from N_{j-1} by inserting each edge in
N that was not inserted in an earlier stage, whose endpoints were not contracted
into a common node, and whose level and essential level are both at most j. By
the definition of the essential level of an edge, each edge disappears through a
contraction no later than in the stage following its insertion, so that the total
cost of the algorithm, exclusive of that of the union-find data structure, comes
to O(m). On the other hand, although the insertion of an edge may be delayed
relative to the original algorithm, every edge is present when it is needed, so
that the modified algorithm is correct.
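
Modification (1) can be illustrated with a standard union-find structure (union by size with path compression). This is a minimal sketch; the class UnionFind and the helper map_edge are hypothetical names chosen here, not taken from the paper.

```python
class UnionFind:
    """Disjoint sets over 0..n-1; each set stands for one contracted node."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:       # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Contract the nodes containing a and b; return the representative."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:   # union by size
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra

def map_edge(uf, edge):
    """Map the original endpoints of an edge to the current contracted nodes."""
    u, v = edge
    return uf.find(u), uf.find(v)
```

An edge whose two mapped endpoints coincide has been absorbed by a contraction and can be discarded instead of being inserted.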

We now sketch how to construct the component tree in O(nm) time when
w ≥ n. Again, the basic approach is as in the simple algorithm. Starting with
a graph that contains the elements of V as vertices and no edges, we insert the
edges in E in an order of nondecreasing levels into a graph H, keeping track of the
transitive closure of H as we do so. The transitive closure is represented through
the rows and columns of its adjacency matrix, each of which is stored in a single
word as a bit vector of length n. It is easy to see that with this representation,
the transitive closure can be maintained in O(n) time per edge insertion. After
each insertion of all edges of a common level, we pause to compute the strongly
connected components, contract each of these and insert a corresponding node in
a partially constructed component tree. This can easily be done in O(kn) time,
where k is the number of vertices taking part in a contraction. Over the whole
computation, the time sums to O(nm).
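
The bit-vector maintenance of the transitive closure might look as follows. This is a sketch only: Python integers stand in for machine words of at least n bits, and the function names are hypothetical. Each edge insertion performs O(n) word operations, and the SCC of a vertex is the bitwise AND of its row and its column.

```python
def make_closure(n):
    """reach[x] has bit y set iff x reaches y; pred[y] has bit x set iff x reaches y."""
    reach = [1 << v for v in range(n)]
    pred = [1 << v for v in range(n)]
    return reach, pred

def insert_edge(reach, pred, u, v):
    """Insert the edge (u, v) and update both bit-vector families."""
    sources, targets = pred[u], reach[v]    # snapshots taken before updating
    for x in range(len(reach)):
        if (sources >> x) & 1:              # x reaches u, hence now everything v reaches
            reach[x] |= targets
        if (targets >> x) & 1:              # v reaches x, hence so does everything reaching u
            pred[x] |= sources

def scc_of(reach, pred, x):
    """Bit vector of the strongly connected component containing x."""
    return reach[x] & pred[x]
```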



2.2 Activating the Nodes in the Component Tree 

In order to be activated at the proper time, each nonroot node y in T needs
to place a “wakeup request” with its parent x. To this effect, each active node
x in T, on level i, say, is equipped with a calendar containing a slot for each
(i - 1)-interval of simulated time during which x is active. The calendar of x is
represented simply as an array of (pointers to) linked lists, each list containing
entries of all children of x requesting to be woken up at the corresponding
(i - 1)-interval. Since the total number of tokens exchanged is O(m), calendars of
total size O(m) suffice, and the calendar of a node x can be allocated when x
becomes active.

The wakeup mechanism requires us to maintain d[y] for each dormant node y
in T with an active parent x; let us call such a node pre-active. We describe below
how to compute d[y] at the moment at which y becomes pre-active. Subsequent
changes to d[y], up to the point at which y becomes active, are handled as follows:
Whenever d[v] decreases for some vertex v ∈ V, we locate the single pre-active
ancestor y of v (how to do this is also described below) and, if appropriate, move
the entry of y in the calendar of its parent to a different slot (in more detail, the
entry is deleted from one linked list and inserted in another).

We list the leaves in T from left to right, calling them points, and associate 
each node in T with the interval consisting of its leaf descendants. When a node 
becomes pre-active, it notifies the last point v in its interval of this fact — we 
will say that v becomes a leader. Now the pre-active ancestor of a point can 
be determined by finding the leader of the point, the nearest successor of the 
point that is a leader. In order to do this, we divide the points into intervals 
of w points each and maintain for each interval a bit vector representing the 
set of leaders in the interval. Moreover, we keep the last point of each interval 
permanently informed of its current leader. Since T is of depth O(w) and the
number of intervals is O(n/w), this can be done in O(n) overall time, and now
each point can find its current leader in O(log w) time.

In order to compute d[y] when y becomes pre-active, we augment the data
structure described so far with a complete binary tree planted over each interval
and maintain for each node in the tree the minimum d value over its leaf
descendants. Decreases of d values are easily executed in O(log w) time, updating
along the path from the relevant leaf to the root of its tree. When a segment of
length r is split, we compute the minima over each of the new segments by following
paths from a leaf to the root in the trees in which the segment begins and ends
and inspecting the roots of all trees in between, which takes O(log w + r/w)
time. Since there are at most m decreases and n - 1 segment splits and the total
length of all segments split is O(nw), the total time comes to O(m log w). This
ends the proof of Theorem 1.



2.3 Extensions 

If only b ≥ 2 of the w+1 possible edge levels 0, ..., w occur in an input network N
with n vertices and m edges, we can solve SSSP problems in N in O(n log log n +
m log b) time and O(n + m) space. For this, the “search space” of the algorithm
BatchedSearch should be taken to be the actual set of edge levels (plus, possibly,
one additional level needed to ensure strong connectivity), and the activation of
nodes in the component tree should use intervals of size b rather than w. This
changes all bounds of O(m log w) in the analysis to O(m log b), and all other
time bounds are O(m + n log log n). This also takes care of the case w > m that
was ignored in Section 2.1.

As observed by Johnson [12], the APSP problem in a strongly connected
network N with n vertices, m edges, edge lengths in {-2^w, ..., 2^w}, and no
negative cycles can be solved with an SSSP computation in N and n - 1 SSSP
computations in an auxiliary network N' with n vertices, m edges, and edge
lengths in {0, ..., n2^w}. The SSSP computation in N can be carried out in
O(nm) time with the Bellman-Ford algorithm [3, 7]. The SSSP computations in
N' can be performed with the new algorithm, but constructing the component
tree for N' only once. Disregarding the construction of the component tree and
the activation of its nodes, the new algorithm works in O(m + n log log n) time.
The node activation can be done within the same time bound by appealing to a
decrease-split-minimum data structure due to Gabow [10]; this connection was
noted by Thorup [21], who provides details. Since the component tree can be
constructed in O(nm) time, this proves the following theorem.
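
Johnson's reduction can be sketched as follows. The sketch assumes a strongly connected network without negative cycles, given as (u, v, length) triples, and arbitrarily picks vertex 0 as the source of the Bellman-Ford computation. For the n - 1 computations in the auxiliary network it uses Dijkstra's algorithm with a binary heap as a simple stand-in for the algorithm of this paper, so it does not achieve the bound of the theorem. All function names are illustrative.

```python
import heapq

def bellman_ford(n, edges, s):
    """SSSP from s with arbitrary edge lengths (no negative cycles assumed)."""
    dist = [float("inf")] * n
    dist[s] = 0
    for _ in range(n - 1):
        for u, v, length in edges:
            if dist[u] + length < dist[v]:
                dist[v] = dist[u] + length
    return dist

def dijkstra(n, adj, s):
    """SSSP from s with nonnegative edge lengths (stand-in SSSP routine)."""
    dist = [float("inf")] * n
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                        # stale heap entry
        for v, length in adj[u]:
            if d + length < dist[v]:
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist

def johnson_apsp(n, edges):
    """All-pairs distances: one Bellman-Ford run plus n - 1 nonnegative SSSP runs."""
    h = bellman_ford(n, edges, 0)           # potentials; finite by strong connectivity
    adj = [[] for _ in range(n)]
    for u, v, length in edges:
        adj[u].append((v, length + h[u] - h[v]))   # reweighted length is >= 0
    result = [h]                            # distances from vertex 0 are already known
    for s in range(1, n):
        d = dijkstra(n, adj, s)
        result.append([d[t] - h[s] + h[t] for t in range(n)])
    return result
```

The reweighted length of an edge (u, v) is its length plus h[u] - h[v], which is nonnegative because h satisfies the shortest-path inequalities; true distances are recovered by undoing the shift.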

Theorem 2. For all positive integers n, m and w with w ≥ log n ≥ 1, all-pairs
shortest-paths problems in networks with n vertices, m edges, edge lengths in the
range {-2^w, ..., 2^w}, and no negative cycles can be solved in O(nm + n^2 log log n)
time and O(n + m) space (not counting the output space) on a word RAM with
a word length of w bits and the restricted instruction set.



3 Shortest Paths in Undirected Networks 

When the algorithm of the previous section is applied to an undirected network,
it is possible to eliminate the bottlenecks responsible for the superlinear
running time. As argued by Thorup [21], the node activation can be done in
O(n + m) overall time by combining the decrease-split-minimum data structure
of Gabow [10] with the Q-heap of Fredman and Willard [9]. The second bottleneck
is the construction of the component tree, for which we propose a new
algorithm.



3.1 The Component Tree for Undirected Networks

In the interest of simplicity, we will assume that there are no edges of zero 
length. This is no restriction, as all connected components of G_0 can be replaced
by single vertices in a preprocessing step that takes O(n + m) time.

We begin by describing a simple data structure that consists of a bit vector 
indexed by the integers 1, . . . , w and an array, also indexed by 1, . . . , w, of stacks 
of edges. The bit vector typically indicates which of the stacks are nonempty. In 
constant time, we can perform a push or pop on a given stack and update the 
bit vector accordingly. Using the MSB instruction, we can also determine the 
largest index, smaller than a given value, of a nonempty stack or, by keeping 
the reverse bit vector as well, the smallest index, larger than a given value, 
of a nonempty stack. In particular, treating the stacks simply as sets, we can 
implement what HI3 calls a neighbor dictionary for storing edges with their levels 
as keys that executes each operation in constant time. Since only the level of a 
key is relevant to the component tree, in the remainder of the section we will 
assume that the length of an edge is replaced by its level and denote the resulting
network by N'.
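
A minimal sketch of the data structure just described is given below. Python's int.bit_length plays the role of the MSB instruction; as a deviation from the text, the successor query isolates the lowest set bit with x & -x instead of keeping a reversed bit vector. The class name and method names are hypothetical.

```python
class LevelBuckets:
    """Bit vector plus array of stacks, indexed by levels 1..w."""

    def __init__(self, w):
        self.stacks = [[] for _ in range(w + 1)]   # index 0 is unused
        self.bits = 0                              # bit i set iff stack i is nonempty

    def push(self, level, edge):
        self.stacks[level].append(edge)
        self.bits |= 1 << level

    def pop(self, level):
        edge = self.stacks[level].pop()
        if not self.stacks[level]:
            self.bits &= ~(1 << level)
        return edge

    def pred(self, level):
        """Largest index < level of a nonempty stack, or None."""
        below = self.bits & ((1 << level) - 1)
        return below.bit_length() - 1 if below else None

    def succ(self, level):
        """Smallest index > level of a nonempty stack, or None."""
        above = self.bits >> (level + 1)
        if not above:
            return None
        lowest = above & -above                    # isolate the lowest set bit
        return level + lowest.bit_length()
```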

Following Thorup [21], we begin by constructing a minimum spanning tree
(MST) of N'. Instead of appealing to the MST algorithm of Fredman and
Willard [9], however, we simply use Prim’s MST algorithm [13], which maintains
a subtree T of N', initially consisting of a single node, processes the edges
in N' one by one, and always chooses the next edge to process as a shortest
edge with at least one endpoint in T. Aided by an instance of the dictionary
described above, we can execute Prim’s algorithm to obtain an MST T_M of N'
in O(m) time. We root T_M at an arbitrary node. The significance of T_M is that
a component tree for T_M (with the original edge lengths) is also a component
tree for N.

The next step is to perform a depth-first search of T_M with the aim of
outputting a list of the edges of T_M, divided into groups. Whenever the search
retreats over an edge e = {u, v} of length l, with u the parent of v, we want, for
i = 1, ..., l - 1, to output as a new group those edges of length i that belong
to the subtree of v, i.e., the maximal subtree of T_M rooted at v, and that were
not output earlier. In addition, in order to output the last edges, we pretend
that the search retreats from the root over an imaginary edge of length ∞. In
order to implement the procedure, we could use an initially empty instance of
the dictionary described in the beginning of the section and, in the situation
above, push e on the stack of index l after popping each of the stacks of index
1, ..., l - 1 down to the level that it had when e was explored in the forward
direction (in order to determine this, simply number the edges in the order in
which they are encountered). Because of the effort involved in skipping stacks
that, although nonempty, do not contain any edges sufficiently recent to be
output, however, this would not work in linear time. In order to remedy this, we
stay with a single array of stacks, but associate a bit vector with each node in
T_M. When the search explores an edge {u, v} of length l in the forward direction,
with u the parent of v, the bit vector of v is initialized to all-zero (denoting an
empty set), and when the search retreats over {u, v}, the bit of index l is set in
the bit vector of u, which is subsequently replaced by the bitwise OR of itself and
the bit vector of v. Thus the final bit vector of v describes the part of the array
of stacks more recent than the forward exploration of {u, v}, so that, when the
search retreats over {u, v}, the relevant edge groups can be output in constant
time plus time proportional to the size of the output. Overall, the depth-first
search takes O(n) time.

Define the length of a group output by the previous step as the common 
length of all edges in the group (their former level). We number the groups 
consecutively and create a node of level equal to the group length for each group 
and a node of level -1 for each vertex in V. Moreover, each group output when
the search retreats over an edge e, except for the longest group output at the
root, computes its parent group as the shortest group longer than itself among
the group containing e and the groups output when the search retreats over e,
and each vertex v ∈ V computes its parent group as a shortest group containing
an edge incident on v. This constructs a component tree for N in O(n + m) time.



References 

1. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, 
and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1993. 

2. R. K. Ahuja, K. Mehlhorn, J. B. Orlin, and R. E. Tarjan, Faster algorithms for the 
shortest path problem, J. ACM 37 (1990), pp. 213-223. 

3. R. Bellman, On a routing problem, Quart. Appl. Math. 16 (1958), pp. 87-90.

4. B. V. Cherkassky, A. V. Goldberg, and C. Silverstein, Buckets, heaps, lists, and 
monotone priority queues, SIAM J. Comput. 28 (1999), pp. 1326-1346. 

5. G. B. Dantzig, On the shortest route through a network, Management Sci. 6 (1960),
pp. 187-190.

6. E. W. Dijkstra, A note on two problems in connexion with graphs, Numer. Math. 
1 (1959), pp. 269-271. 

7. L. R. Ford, Jr. and D. R. Fulkerson, Flows in Networks, Princeton University Press, 
Princeton, NJ, 1962. 

8. M. L. Fredman and R. E. Tarjan, Fibonacci heaps and their uses in improved 
network optimization algorithms, J. ACM 34 (1987), pp. 596-615. 

9. M. L. Fredman and D. E. Willard, Trans-dichotomous algorithms for minimum 
spanning trees and shortest paths, J. Comput. System Sci. 48 (1994), pp. 533- 
551. 

10. H. N. Gabow, A scaling algorithm for weighted matching on general graphs, in 
Proc. 26th Annual IEEE Symposium on Foundations of Computer Science (FOCS 
1985), pp. 90-100. 

11. T. Hagerup, Sorting and searching on the word RAM, in Proc. 15th Annual Sym- 
posium on Theoretical Aspects of Computer Science (STACS 1998), Lecture Notes 
in Computer Science, Vol. 1373, Springer, Berlin, pp. 366-398. 

12. D. B. Johnson, Efficient algorithms for shortest paths in sparse networks, J. ACM 
24 (1977), pp. 1-13. 

13. R. C. Prim, Shortest connection networks and some generalizations, Bell Syst.
Tech. J. 36 (1957), pp. 1389-1401.




14. R. Raman, Priority queues: Small, monotone and trans-dichotomous, in Proc. 4th
Annual European Symposium on Algorithms (ESA 1996), Lecture Notes in Computer
Science, Vol. 1136, Springer, Berlin, pp. 121-137.

15. R. Raman, Recent results on the single-source shortest paths problem, SIGACT 
News 28:2 (1997), pp. 81-87. 

16. R. Raman, Priority queue reductions for the shortest-path problem, in Proc. 10th 
Australasian Workshop on Combinatorial Algorithms (AWOCA 1999), Curtin Uni- 
versity Press, pp. 44-53. 

17. R. E. Tarjan, Efficiency of a good but not linear set union algorithm, J. ACM 22
(1975), pp. 215-225.

18. M. Thorup, On RAM priority queues, in Proc. 7th Annual ACM-SIAM Symposium 
on Discrete Algorithms (SODA 1996), pp. 59-67. 

19. M. Thorup, Randomized sorting in O(nloglogn) time and linear space using ad- 
dition, shift, and bit-wise boolean operations, in Proc. 8th Annual ACM-SIAM 
Symposium on Discrete Algorithms (SODA 1997), pp. 352-359. 

20. M. Thorup, Faster deterministic sorting and priority queues in linear space, in
Proc. 9th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 1998), pp.
550-555.

21. M. Thorup, Undirected single-source shortest paths with positive integer weights 
in linear time, J. ACM 46 (1999), pp. 362-394. 

22. P. van Emde Boas, Preserving order in a forest in less than logarithmic time and
linear space, Inform. Process. Lett. 6 (1977), pp. 80-82.

23. P. van Emde Boas, R. Kaas, and E. Zijlstra, Design and implementation of an
efficient priority queue, Math. Syst. Theory 10 (1977), pp. 99-127.

24. P. D. Whiting and J. A. Hillier, A method for finding the shortest route through a 
road network, Oper. Res. Quart. 11 (1960), pp. 37-40. 




Improved Algorithms 

for Finding Level Ancestors in Dynamic Trees 



Stephen Alstrup and Jacob Holm 

The IT University of Copenhagen, 
Glentevej 67, DK-2400, Denmark. 

{Stephen, jholm}@itu.dk



Abstract. Given a node x at depth d in a rooted tree, LevelAncestor(x, i)
returns the ancestor to x in depth d - i. We show how to maintain
a tree under addition of new leaves so that updates and level ancestor
queries are performed in worst case constant time. Given a
forest of trees with n nodes where edges can be added, m queries and
updates take O(m α(m, n)) time. This solves two open problems (P.F.
Dietz, Finding level-ancestors in dynamic trees, LNCS, 519:32-40, 1991).
In a tree with node weights, min(x, y) reports the node with minimum
weight on the path between the nodes x and y. We can substitute the
LevelAncestor query with min, without increasing the complexity for
updates and queries. Previously such results have been known only for
special cases (e.g. R.E. Tarjan. Applications of path compression on
balanced trees. J.ACM, 26(4):690-715, 1979).



1 Introduction 

Given a collection of rooted trees and a node x in depth d, LevelAncestor(x, i)
returns the ancestor to x in depth d - i. We give a simple algorithm to preprocess
a tree in linear time so that queries can be answered in worst case constant time.
New leaves can be added to the tree (AddLeaf), so that each update and query
take worst case constant time. For a forest of trees with n nodes where new edges
may be added (Link), updates have amortized complexity O(α(l, n)) and queries
have worst case complexity O(l) time, where α is the row inverse Ackermann
function and l > 0 is an arbitrary integer. This matches a RAM lower bound P,
for word size O(log n). The results are presented in a self-contained manner, i.e.,
the results use classic techniques, but do not depend on any non-trivial data 
structures. 

In II , Chazelle needs a specialized version of the problem: given two nodes x 
and y, return the child of y which is an ancestor to x. An identical problem arises 
in range query problems jEj. In [El, Harel and Tarjan give a quite involved 
optimal algorithm for the level ancestor problems for a special kind of trees 
with maximum height O(logn). The solution is used to find nearest common 
ancestors of two nodes in a tree. In m, level ancestor queries are used to
recognize breadth-first trees, and Dietz nm studies the problem in connection
with persistent data structures.

In [7], Berkman and Vishkin show how to preprocess a tree in linear time
so that level ancestor queries can be answered on-line in constant time, however,
introducing a constant 2^ . The solution in [7] is based on a reduction using an
Euler tour in the tree. Dietz cni gave an amortized constant time per operation
algorithm for a tree that grows under addition of leaves. Doing this he uses an
approach different from [7]. Dietz partitions the tree into heavy paths and uses
a fast search structure for small sets m in a two level system. The fast search
structure supports predecessor queries in constant time in a set with at most
O(log n) keys.

Our worst case optimal algorithm for adding leaves to a tree, and our optimal 
result for maintaining a forest where new edges can be inserted, solve two open 
problems stated by Dietz in HD). The results that we present apply another 
approach than Berkman, Vishkin and Dietz pna, and do not use a fast search
structure or introduce large constants. For a static tree and a tree which can 
grow under the addition of leaves, we present a new shortcutting technique 
and combine it with a version of Gabow and Tarjan ’s m micro tree technique 
similar to that it To maintain a forest under insertion of new edges, we use 
Gabow’s α-scheme j 1 41 1 0) . Doing this in a direct way introduces a more general
level ancestor problem where edges have associated weights. We show how to 
solve this problem, still keeping the constants low. 



1.1 Variants and Extensions 

Let the nodes in the tree have weights associated. We can substitute the 
LevelAncestor query with the queries below, without increasing the complexity 
for updates and queries: 

- min(x, y): return the node with minimum weight on the path x ··· y.

- succ(x, y, d): return the first node z on the path from x to y where
dist(x, z) > d. Here dist is the sum of weights on the path between the two
nodes.

For the succ operation the weights must be polylogarithmically bounded integers,
i.e., weight(v) = O((log n)^c) for a constant c, to achieve the claimed
bounds. For larger weights we are dealing with the classic predecessor prob- 
lem |5|. To achieve the claimed bounds for min, we need the ability to deter- 
mine the rank of a weight in a small dynamic set. To do this we use the results 
from ^3|- 

In jhfdfi] . Chazelle and Thorup (in a parallel computation model) show how
to insert m shortcut edges in a tree so that given any pair of nodes, a path
(using shortcut edges) of length at most O(α(m, n)) can be reported in a time
linear in the length of the path. Their results are optimal by a result of Yao |T?nj .
For each shortcut edge (a, b) inserted, two (shortcut) edges (a, c), (c, b) already
exist. Their technique only applies to static trees, i.e., addition of new leaves is
not allowed. Using the shortcutting technique, min queries can be answered in
O(α(n, n)) time after linear time preprocessing. The problem of determining
whether a spanning tree is a minimum spanning tree for a graph can be reduced
to the min query for a static tree. Using the shortcutting technique, an almost
linear time algorithm is achieved. As a direct application of the classic Union- 
Find technique for disjoint sets, Tarjan m shows how to answer queries like 
min for a forest, which can be updated under edge insertions. However, queries 
are restricted to include the root of the tree, i.e. finding the minimum weight 
from a node to the root. Linear time minimum spanning tree verification algo- 
rithms [Nil take advantage of the fact that all queries are known in advance. 
This, combined with micro tree techniques HE! and m, gives a linear time al- 
gorithm. In PI it was raised as an open question how to preprocess a tree in 
linear time so that min queries could be answered in constant time. Finding the 
minimum weight on just a path for a restricted domain (1 • • • n) is sufficient to 
find nearest common ancestor (NCA) in a tree (see [IZ])- Two optimal (almost) 
identical algorithms for NCA in a static tree are given in j I . whereas ^ 
solves the min path problem. In HSI, Harel considers min queries in a forest 
where new edges can be inserted as in PI- He gives a linear time algorithm for 
the case where all edge insertions are known in advance and the weights can be 
sorted in linear time. 

Summarizing: The static techniques for shortcutting are more general than
our techniques, but use optimal O(α(n, n)) time for each query compared to our
constant time complexity. Our techniques can handle addition of leaves in worst
case constant time. For the more general link command we support queries
between any pair of nodes in the tree, opposite to Tarjan, where one of the
nodes should be the root, thus somehow restricting the technique to off-line
problems. Furthermore, our techniques support the succ command as opposed
to both Tarjan’s and the static shortcutting techniques. The succ command
is used for LevelAncestor-like problems, and in the more general cases also
for successor/predecessor queries in a tree. In order to answer min queries in
constant time after linear time preprocessing we need non-comparison based 
techniques (El. 



1.2 Fully Dynamic Trees 



In [23], Sleator and Tarjan give an algorithm to maintain a forest of trees under
insertion and deletion of edges supporting min queries. Each operation is
supported in worst case O(log n) time. As trivial applications of top
trees m LevelAncestor and succ can also be supported in O(log n) worst case
time per operation. On a RAM with word size O(log n), we have the usual gap to
the Ω(log n / log log n) lower bound [2] for fully dynamic tree problems. For the
results achieved in [23] and the applications of top trees [2] there is no restriction
on the node weights. Both algorithms are pointer algorithms [23] and optimal
for this model, since we for the static case have a trivial lower bound Ω(log h)
for queries in a tree with height h. A matching pointer algorithm upper bound
for the static problem is given by Tsakalidis and Van Leeuwen.

1.3 Outline

In section 2 we give a linear time algorithm to preprocess a tree so that level
ancestor queries can be answered in worst case constant time. Using the tech- 
niques from section 13 we in section 0show how to support AddLeaf. Given this 
result in details, we sketch (because of lack of space) the remaining results in 
section 0 

1.4 Preliminaries 

Let T be a rooted tree with root node r = root(T). The parent of a node v in T is
denoted by parent(v). The depth of v is d(v) = d(parent(v)) + 1, and d(r) = 0.
The nodes on the path from v to r are ancestors of v, and the nodes on the
path from parent(v) to r are proper ancestors of v. If w is an ancestor of v, then
v is a descendant of w. LevelAncestor(x, i) returns the ancestor y to x where
d(x) - d(y) = i. If such an ancestor does not exist, the root r is returned (in this
paper we ignore this trivial case). The subtree rooted at v is the tree induced
by all descendants of v. The size of the subtree rooted at v is denoted s(v). The
length of a path between two nodes v, w is denoted dist(v, w) and is the number
of nodes on the unique simple path between v and w. If the nodes/edges have
weights, we let dist(v, w) denote the sum of weights on the path. The notation
d | a means that a = kd for an integer k. If such a k does not exist, we write d ∤ a.
We let log x denote log_2 x. For the row inverse Ackermann function α and the
(functional) inverse Ackermann function α, we use the standard definition [15],
i.e., α(1, n) = O(log n), α(2, n) = O(log* n), etc.

2 Static Level-Ancestor

In this section we will show how to preprocess a tree in linear time and space, so 
that level ancestor queries can be answered on-line in worst case constant time. 
We do this by first introducing a simple algorithm which uses O(n log n) time
and space for preprocessing of a tree with n nodes, and then using micro trees 
to reduce the time and space to linear. 

2.1 Macro Algorithm 

We define the rank of v, denoted r(v), to be the maximum integer i so that
2^i | d(v) and s(v) ≥ 2^i. Note that with this definition the rank of the root is
⌊log_2 n⌋.

Observation 1. The number of nodes with rank ≥ i is at most ⌊n/2^i⌋.

The preprocessing algorithm consists of precomputing the depth, size and 
rank of each node in the tree and then constructing the following two tables: 

levelanc[v][x]: contains the x'th ancestor to v, for 0 ≤ x < 2^{r(v)}.

jump[v][i]: contains the first proper ancestor to v whose depth is divisible by 2^i,
for 0 ≤ i < log_2(d(v) + 1).

The idea behind the algorithm is to use the jump[][] table a constant number of 
times in order to reach a node w with a sufficiently large rank to hold the answer 
to the level ancestor query in its table. The complete pseudocode for answering 
level ancestor queries looks as follows: 

LevelAncestor(v, x):
    i := ⌊log_2(x + 1)⌋
    d := d(v) - x
    while 2(d(v) - d) ≥ 2^i        // max 4 times!
        v := jump[v][i - 1]
    return levelanc[v][d(v) - d]
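
The tables and the query can be sketched in Python as follows. The sketch assumes that the tree is given as a parent array with parent[0] == -1 for the root and parent[v] < v for every other node, and that queries satisfy 0 ≤ x ≤ d(v). One detail is an addition made here and is not stated in the text: when the entry jump[parent(v)][i] lies outside the table of parent(v), the root is used, since it is then the only ancestor whose depth is divisible by 2^i. The function names are hypothetical.

```python
def preprocess(parent):
    """Build depth, levelanc and jump tables; parent[0] == -1 and parent[v] < v."""
    n = len(parent)
    depth, size = [0] * n, [1] * n
    for v in range(1, n):                       # top-down pass for depths
        depth[v] = depth[parent[v]] + 1
    for v in range(n - 1, 0, -1):               # bottom-up pass for subtree sizes
        size[parent[v]] += size[v]
    levelanc, jump = [], []
    for v in range(n):
        r = 0                                   # rank: max i with 2^i | d(v), s(v) >= 2^i
        while depth[v] % (2 << r) == 0 and size[v] >= (2 << r):
            r += 1
        row, u = [], v                          # the first 2^r ancestors of v (v included)
        for _ in range(1 << r):
            row.append(u)
            if parent[u] < 0:
                break                           # the table of the root is clipped
            u = parent[u]
        levelanc.append(row)
        jrow, i = [], 0
        while (1 << i) <= depth[v]:             # i.e. 0 <= i < log2(d(v) + 1)
            p = parent[v]
            if depth[p] % (1 << i) == 0:
                jrow.append(p)
            elif i < len(jump[p]):
                jrow.append(jump[p][i])
            else:
                jrow.append(0)                  # assumption: fall back to the root
            i += 1
        jump.append(jrow)
    return depth, levelanc, jump

def level_ancestor(tables, v, x):
    """Return the ancestor of v at depth d(v) - x, for 0 <= x <= d(v)."""
    depth, levelanc, jump = tables
    i = (x + 1).bit_length() - 1                # floor(log2(x + 1))
    d = depth[v] - x
    while 2 * (depth[v] - d) >= (1 << i):       # at most four iterations
        v = jump[v][i - 1]
    return levelanc[v][depth[v] - d]
```

On a path 0, 1, ..., 7 rooted at 0, the call level_ancestor(tables, 7, 5) jumps from node 7 to 6, 4 and 2 and then reads the answer 2 from the first entry of the table of node 2.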



Lemma 2. A tree with n nodes can be preprocessed in O(n log n) time and space,
allowing level ancestor queries to be answered in worst case constant time.

Proof. The depth and size of each node can be computed by a simple
top-down/bottom-up traversal of the tree. For any node v, r(v) can then be computed
as max{i : 2^i | d(v) ∧ s(v) ≥ 2^i}, in O(log n) time.

The levelanc[v][] table can for each v be constructed in O(2^{r(v)}) time, by
following parent pointers. The total time to build the levelanc[][] table is
therefore linear in the number of entries. By Observation 1 the number of entries is
at most Σ_{0 ≤ i ≤ log_2 n} ⌊n/2^i⌋ 2^i = O(n log n).

For any v and any i, 0 ≤ i < log_2(d(v) + 1), we note that jump[v][i] =
parent(v) if 2^i | d(parent(v)), and jump[v][i] = jump[parent(v)][i] otherwise. Thus
the jump[][] table can be computed by a simple top-down traversal of the tree,
and like above this takes linear time in the number of entries, which is O(n log n).

Now we only need to show that a LevelAncestor(v, x) query is computed
in worst case constant time. If x = 0 or x = 1 this is trivial, so assume that
x > 1. Let w be the ancestor of v with depth d = d(v) - x. Then w is the node
we should return. Setting i := ⌊log_2(x + 1)⌋ as in the algorithm, we have that
among the x + 1 nodes on the path from v to w, there are between 2 and 4 nodes
whose depth is divisible by 2^{i-1}. Obviously the “while” loop finds the topmost
of these in at most four steps, and then it stops. After the loop, d(v) is divisible
by 2^{i-1}, and since we know that v has descendants down to d(v) + 2^{i-1}, we must
also have s(v) ≥ 2^{i-1}. Thus r(v) ≥ i - 1 and the levelanc[v][] table has length
≥ 2^{i-1}. But since d(v) - d < 2^{i-1}, we can now find w as levelanc[v][d(v) - d].
Thus, as desired, we have found w in at most a constant number of steps.



2.2 Hybrid Algorithm 

In this section we will reduce the time and space complexity of our preprocessing 
algorithm to linear by applying the following lemma, proved in the next section.

Lemma 3. Given O(n) time and space for pre-preprocessing, we can preprocess
any tree T with at most (1/2) log_2 n nodes in O(|T|) time, allowing level ancestor
queries to be answered in worst case constant time.

Let T be a tree with n ≥ 4 nodes, let r_0 = ⌊log_2 log_2 n - 1⌋ and let M = 2^{r_0}.
Now r_0 and M are integers and (1/4) log_2 n < M ≤ (1/2) log_2 n. We define the set
of macro nodes in T to be the set of all nodes v in T with r(v) ≥ r_0. By
Observation 1 there are at most n/2^{r_0} = n/M macro nodes in T.
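
As a small worked illustration, the parameters r_0 and M can be computed with integer arithmetic only, using the fact that r_0 is the largest integer r with 2^(2^(r+1)) ≤ n. The function name macro_parameters is hypothetical.

```python
def macro_parameters(n):
    """Return (r0, M) with r0 = floor(log2(log2(n)) - 1) and M = 2^r0, for n >= 4."""
    r0 = 0
    # r0 can grow while 2^(r0 + 2) <= log2(n), i.e. while 2^(2^(r0 + 2)) <= n.
    while (1 << (1 << (r0 + 2))) <= n:
        r0 += 1
    return r0, 1 << r0
```

For n = 65536 this gives r_0 = 3 and M = 8, which meets the upper bound (1/2) log_2 n = 8 with equality. Because M is a power of two, a quotient such as ⌊x/M⌋ can be obtained with the right shift x >> r_0.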

Observation 4. Let v be a node with s(v) ≥ M; then there is a macro node
among the first M ancestors to v.

To enable finding the first macro node that is a proper ancestor to any node, we 
introduce the following table: 

jumpM[v]: contains the first proper ancestor of v whose depth is divisible by M.

The first proper macro node ancestor to a node v is then either jumpM[v] or
jumpM[jumpM[v]]. (The reason for not simply letting jumpM[v] point directly to
the node is to simplify section E2I).

Let T/M denote the macro tree induced by the macro nodes of T. Since 
M = Ω(log n), we can use the algorithm from the previous section to preprocess 
T/M in O((n/M) log(n/M)) = O(n) time and space to answer level ancestor 
queries in constant time. The distance in T between any macro node and its 
parent in the macro tree T/M is exactly M; thus for any macro node v we can 
in constant time find the macro node of least depth on the path in T from v to 
LevelAncestor(v, x) by simply computing LevelAncestor_{T/M}(v, ⌊x/M⌋). (Even the 
division takes constant time, since M = 2^{r_0} is a power of 2). The distance from 
this node to LevelAncestor(v, x) is less than M. 
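
The choice of r_0 and M above can be checked numerically. The following is a minimal Python sketch, not part of the original algorithm description; the function names macro_parameters and macro_steps are hypothetical, and floating-point logarithms are assumed to be accurate enough for the sample sizes used.

```python
import math

def macro_parameters(n: int) -> tuple[int, int]:
    """Return (r0, M) with r0 = floor(log2(log2(n)) - 1) and M = 2**r0."""
    if n < 4:
        raise ValueError("the construction assumes n >= 4")
    r0 = math.floor(math.log2(math.log2(n)) - 1)
    return r0, 1 << r0

def macro_steps(x: int, r0: int) -> int:
    """floor(x / M) for M = 2**r0, computed with a right shift."""
    return x >> r0

# Check the bound (1/4) log2 n < M <= (1/2) log2 n on a few sample sizes,
# and that the shift agrees with integer division by M.
for n in (4, 16, 1000, 2**16, 10**9):
    r0, M = macro_parameters(n)
    log_n = math.log2(n)
    assert log_n / 4 < M <= log_n / 2
    assert all(macro_steps(x, r0) == x // M for x in range(200))
```

Because M is a power of two, the number of macro-tree steps ⌊x/M⌋ is a single shift, which is the constant-time division the text refers to.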

If there is a macro node on the path from v to w = LevelAncestor(v, x), we 
therefore only need to find the first macro node that is ancestor to v and use the 
algorithm for the macro tree to find a node with no macro nodes on the path 
to w. The above discussion shows that this can be done in worst case constant 
time. Thus all that remains to be shown is how to handle level ancestor queries 
where there are no macro nodes on the path. The idea now is to partition T into 
micro trees, in such a way that we can find any of the ancestors to a node v up 
to the first macro node by looking in at most two micro trees. Specifically we 
partition T into micro trees of size ≤ M, so that if |μ| < M for some micro tree 
μ, then all descendants of root(μ) are in μ. The partition can be done in linear 
time using one top-down traversal of the tree. A micro tree in this partition is 
called full, if it has exactly M nodes. For any node v, let μ(v) denote the micro 
tree containing v, and let μ_p(v) denote the micro tree μ(parent(root(μ(v)))). 
From the definition of the partition it follows that μ_p(v) is a full tree, unless 
root(T) ∈ μ(v) (in which case μ_p(v) is undefined). 

For each micro tree μ, create a table levelancM[μ][] containing the first |μ| 
ancestors to root(μ). Since the micro trees form a partition of T, this table has 
exactly one entry for each node in T and thus size O(n). By observation 4 the 
levelancM[][] table for a full micro tree contains a macro node. It follows that 
each ancestor of a node v up to the first macro node is contained in either 
μ_p(v) or levelancM[μ_p(v)][] as desired. By lemma 3 level ancestor queries can 
now be answered in worst case constant time as follows. 

LevelAncestor(v, x): 
    d := d(v) − x 
    w := jumpM[v] 
    if w is not a macro node 
        w := jumpM[w] 
    // Now w is the first macro node on the path from v to root(T). 
    if d(w) > d then 
        v := LevelAncestor_{T/M}(w, ⌊(d(w) − d)/M⌋) 
    // Now there are no macro nodes on the path from v. 
    if d(root(μ(v))) ≤ d then 
        return LevelAncestor_{μ(v)}(v, d(v) − d) 
    v := parent(root(μ(v))) 
    // μ(v) is now a full micro tree. 
    if d(root(μ(v))) ≤ d then 
        return LevelAncestor_{μ(v)}(v, d(v) − d) 
    return levelancM[μ(v)][d(root(μ(v))) − d] 

We have thus proven the following: 

Theorem 5. A tree can be preprocessed in linear time and space, allowing level 
ancestor queries to be answered in worst case constant time. 

2.3 Micro Algorithm 

In this section we will show how to construct a set of tables in O(n) time, so 
that we can preprocess any tree with at most N = ⌊½ log_2 n⌋ nodes, allowing 
level ancestor queries to be answered in worst case constant time. 

Let μ be the micro tree we want to preprocess. We number all the nodes of 
μ in a top-down order and create a table nodetable[μ][i] containing the node 
in μ with number i for each 0 ≤ i < |μ|. To represent the ancestor relation 
efficiently, we use a single word anc[v] for each node v, where bit i is set if and 
only if the node numbered i in μ is an ancestor of v. To find the xth ancestor 
to v we now only need to find the index i of the xth most significant bit that 
is set in anc[v] and then return nodetable[μ][i]. In order for this query to take 
worst case constant time, we construct the following table which is completely 
independent of μ: 

bitindex[w][i]: contains the position of the ith most significant set bit in w, 
for 0 ≤ w < 2^N and 0 ≤ i < N. If only k < i + 1 bits of w are set, 
bitindex[w][i] = k − (i + 1). 

Given these tables, each level ancestor query is answered in worst case constant 
time as follows: 

LevelAncestor_μ(v, x): 
    i := bitindex[anc[v]][x] 
    if i > −1 return nodetable[μ(v)][i] 



We are now ready to prove lemma 3. 

Lemma. Given O(n) time and space for pre-preprocessing, we can preprocess 
any tree T with at most ½ log_2 n nodes in O(|T|) time, allowing level ancestor 
queries to be answered in worst case constant time. 

Proof. First, we must show how to create the bitindex[][] table. This can easily 
be done, as bitindex[2^j + k][0] = j, and bitindex[2^j + k][i] = bitindex[k][i − 1] 
for 0 ≤ k < 2^j < 2^N and 0 < i < N. This means that we can start by 
setting bitindex[0][i] := −(i + 1) for 0 ≤ i < N, and then keep doubling the 
table size until we are done. This takes linear time in the number of entries, 
which is N 2^N ≤ ½ log_2 n · 2^{½ log_2 n} = ½ √n log_2 n = O(n). This concludes the 
pre-preprocessing. 

To preprocess each micro tree μ with at most N nodes, we do as follows: 
Traverse μ in any top-down order. Let i be the number of the node v in the 
traversal. We set nodetable[μ][i] := v and if v is the root of μ we set anc[v] := 2^i; 
otherwise we set anc[v] := anc[parent(v)] + 2^i. 
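
The micro algorithm can be illustrated with a small Python sketch. It is only an illustration under assumptions: the names build_bitindex and MicroTree are hypothetical, Python integers stand in for machine words, and a node's own number is included in its anc word, as in the preprocessing step above.

```python
def build_bitindex(N):
    """bitindex[w][i] = position of the i-th most significant set bit of w
    (i = 0 is the highest set bit), or k - (i + 1) if only k <= i bits are set."""
    table = [[-(i + 1) for i in range(N)]]      # w = 0: no bits set
    for j in range(N):                          # doubling step: words 2^j .. 2^(j+1) - 1
        for k in range(1 << j):
            # highest set bit of 2^j + k is j; the rest is answered by row k
            table.append([j] + table[k][:N - 1])
    return table

class MicroTree:
    def __init__(self, bitindex):
        self.bitindex = bitindex
        self.nodetable = []   # nodetable[i] = node numbered i
        self.anc = {}         # anc[v] = word with bit i set for v and each ancestor numbered i

    def add(self, v, parent=None):
        i = len(self.nodetable)   # top-down numbering: a parent is always added first
        self.nodetable.append(v)
        self.anc[v] = (self.anc[parent] if parent is not None else 0) + (1 << i)

    def level_ancestor(self, v, x):
        i = self.bitindex[self.anc[v]][x]
        return self.nodetable[i] if i > -1 else None

# Example: root a with children b and c; d is a child of b.
tree = MicroTree(build_bitindex(4))
tree.add("a")
tree.add("b", "a")
tree.add("c", "a")
tree.add("d", "b")
assert tree.level_ancestor("d", 1) == "b"
assert tree.level_ancestor("d", 2) == "a"
assert tree.level_ancestor("d", 3) is None
```

The bitindex table depends only on N, so one table can be shared by all micro trees; each query is then two table lookups.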

3 Level Ancestor with AddLeaf 

In this section we will extend the static level ancestor algorithm from the previous 
section to support the AddLeaf operation in worst case constant time. We do 
this in three steps: First we show that the macro algorithm can be extended to 
support AddLeaf operations in worst case logarithmic time. Second, we show 
that the selection of macro nodes in the hybrid algorithm can be done in worst 
case constant time per AddLeaf . Finally we show that the micro algorithm can 
be modified to allow AddLeaf operations in worst case constant time. 

For both the micro- and macro algorithms we implicitly use the well-known 
result that we can maintain a dynamically sized array supporting each of the 
operations IncreaseLength (adding one to the length of the array) and LookUp 
(returning the i’th element in the array) in worst case constant time. 
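
One standard way to obtain such an array is to copy incrementally into a double-size buffer. The sketch below is an assumption-laden illustration rather than the construction the authors have in mind: the class name GrowableArray is hypothetical, and allocating an uninitialised buffer is assumed to cost constant time as in the RAM model (which Python's list allocation does not guarantee).

```python
class GrowableArray:
    """Array with a constant amount of work per increase_length and lookup."""

    def __init__(self):
        self._cur = [None] * 2   # active storage
        self._nxt = None         # double-size storage being filled incrementally
        self._copied = 0         # cur[0:_copied] is already mirrored in _nxt
        self._n = 0              # logical length

    def __len__(self):
        return self._n

    def lookup(self, i):
        if not 0 <= i < self._n:
            raise IndexError(i)
        return self._cur[i]

    def increase_length(self, value=None):
        cap = len(self._cur)
        if self._nxt is None and 2 * self._n >= cap:
            # Assumption: allocation is O(1) in the RAM model.
            self._nxt = [None] * (2 * cap)
            self._copied = 0
        self._cur[self._n] = value
        if self._nxt is not None:
            self._nxt[self._n] = value
        self._n += 1
        for _ in range(2):       # constant amount of copying per operation
            if self._nxt is not None and self._copied < self._n:
                self._nxt[self._copied] = self._cur[self._copied]
                self._copied += 1
        if self._n == cap:       # cur is full; nxt holds every element by now
            self._cur, self._nxt = self._nxt, None
```

Migration starts when the array is half full and copies two old entries per operation, so the larger buffer is complete exactly when the smaller one fills up.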

3.1 Macro Algorithm 

When a new leaf v is added to the tree T, we must make sure that the tables 
levelanc[][] and jump[][] are updated correctly. For the node v, jump[v][] can 
be computed exactly as in the static case in O(log n) time, and levelanc[v][] 
has only one entry, levelanc[v][0] = v. The addition of v does not influence 
the jump[][] tables for any other node in the tree, but it may cause the rank of 
⌊log_2 n⌋ of its ancestors to increase by one, which means that the levelanc[][] 
tables for these ancestors should be doubled. 

If we just doubled the levelanc[][] tables as necessary, we would get an 
algorithm with amortized complexity O(log n) per AddLeaf. Instead, when we 
add a new leaf v to the structure, we extend the levelanc[w][] table for each 
ancestor w to whose rank v may contribute. The rank of w can only be increased 
when the number of nodes below w has doubled since the last increase; thus when 
the rank is increased, the table already has the correct length. The new node 
v has at most ⌊log_2(d(v) + 1)⌋ ancestors whose rank it may contribute to, and 
these are exactly the nodes in the jump[v][] table. Thus, when we add v as a 
new leaf, only a logarithmic number of tables has to be extended. Using the 
standard table-extending technique mentioned earlier, this can be done in worst 
case logarithmic time. 

The following macro algorithm for AddLeaf therefore runs in worst case 
logarithmic time: 

AddLeaf(v, p): 
    levelanc[v][0] := v 
    for i := 0 to ⌊log_2(d(v) + 1)⌋ − 1 do 
        if 2^i | d(p) then w := p else w := jump[p][i] 
        jump[v][i] := w 
        extend the levelanc[w][] table with one more ancestor to w. 
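
The jump-table recurrence used in the loop can be tried out in isolation. The Python sketch below is a simplification under stated assumptions: it keeps a fixed number K of jump levels per node instead of the depth-dependent range above, assumes the root has depth 0, and omits the levelanc[][] extension entirely; all names are hypothetical.

```python
K = 5  # hypothetical fixed number of jump levels kept per node

depth, parent, jump = {}, {}, {}

def add_root(r):
    depth[r], parent[r], jump[r] = 0, None, []

def add_leaf(v, p):
    depth[v], parent[v] = depth[p] + 1, p
    # jump[v][i]: first proper ancestor of v whose depth is divisible by 2**i.
    # Either the parent qualifies, or the answer is the same as for the parent.
    jump[v] = [p if depth[p] % (1 << i) == 0 else jump[p][i] for i in range(K)]

def jump_brute_force(v, i):
    """Reference answer obtained by walking parent pointers."""
    w = parent[v]
    while depth[w] % (1 << i) != 0:
        w = parent[w]
    return w

# Build a small tree (mostly a path, with a few branches) and compare.
add_root("r")
nodes = ["r"]
for k in range(1, 25):
    v = f"n{k}"
    add_leaf(v, nodes[(k - 1) // 2 if k % 3 == 0 else k - 1])
    nodes.append(v)

assert all(jump[v][i] == jump_brute_force(v, i)
           for v in nodes[1:] for i in range(K))
```

Since the root's depth 0 is divisible by every power of two, the recurrence never reads the (empty) table of the root in this simplified setting.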



As a final remark, we note that this algorithm can be run incrementally, divid- 
ing each AddLeaf into O(logn) separate steps (one for each iteration of the 
for-loop). The AddLeafSteps for different nodes can be mixed arbitrarily (how- 
ever new leaves can only be added below fully inserted nodes) together with 
LevelAncestor queries concerning fully inserted nodes. Each AddLeafStep and 
LevelAncestor query still runs in worst case constant time, and this will be im- 
portant in the next section. 



3.2 Hybrid Algorithm 

In order to extend the hybrid algorithm from section 2.2 to allow AddLeaf 
operations, we must show how to maintain both the jumpM[] and levelancM[][] 
tables. Adding a leaf to the tree does not change jumpM[w] for any node w, so we 
only have to compute jumpM[v] for the new leaf v, and this takes constant time. 

To maintain the levelancM[][] table when adding a leaf v with parent p in 
the tree, we have two cases. 

1. If |μ(p)| < M we must add v as a leaf in μ(p) and extend the levelancM[μ(p)][] 
table. 

2. Otherwise create a new micro tree μ(v) and set levelancM[μ(v)][0] := v. 

The rest of the work is done either by the micro algorithm AddLeaf_μ described 
in the next section or by the AddLeafStep algorithm from the previous 
section. If each runs in worst case constant time, we can combine them into the 
following hybrid algorithm running in worst case constant time: 

AddLeaf(v, p): 
    AddLeaf_μ(v, p) 
    if 2^{r_0} | d(p) then jumpM[v] := p else jumpM[v] := jumpM[p] 
    AddLeafStep(jumpM[v], jumpM[jumpM[v]]) 



Here it is assumed that it takes at most M AddLeafSteps to fully insert a 
node in the macro tree. The algorithm assumes that we know the total number 
of nodes and thus r_0 and M ahead of time. However, using standard doubling 
techniques, we can easily extend the algorithm to handle the case where the total 
number of nodes is not known in advance. 

Theorem 6. We can maintain a rooted tree supporting AddLeaf and 
LevelAncestor operations in worst case constant time. 

3.3 Micro Algorithm 

When adding a leaf to a micro tree we only need to show how to update the 
nodetable[][] and anc[] tables, since the bitindex[][] table depends only on the 
size of M, and can be handled using standard doubling techniques. The full 
micro algorithm for AddLeaf looks as follows: 

AddLeaf_μ(v, p): 
    k := |μ| 
    extend nodetable[μ][] with v. 
    anc[v] := anc[p] + 2^k 



4 Link and Query Variants 

In this section we sketch how to support insertion of new edges in a forest of 
rooted trees supporting level ancestor, min and succ queries. First we focus on 
level ancestor. Let r be the root of a tree T and v a new node. The operation 
AddRoot(v, r) inserts a new edge between the node v and r, making v the root of 
the combined tree. Hence, the depth of all nodes in T increases by 1, and d(r) = 1 
after the operation. Let A be the algorithm given in the last section supporting 
addition of new leaves in a tree in constant time. It is simple to extend A, so 
that the operation AddRoot(v, r) is supported in worst case constant time, using 
an extendable array for all nodes added as roots to the tree. In general we have: 

Theorem 7. We can maintain a dynamic forest of rooted trees supporting Add- 
Leaf, AddRoot and LevelAncestor operations in worst case constant time. 

In [ 1 41 1 .'ij Gabow gives an α-scheme to handle insertion of new edges in a 
forest of rooted trees for connectivity-like queries. In order to use the α-scheme 
one should provide an algorithm which handles AddLeaf and AddRoot operations. 
Given such an algorithm with constant time worst case complexity for 
updates and queries, the α-scheme (if it can be applied) gives an algorithm with 
amortised complexity O(α(l, n)) for updates and worst case complexity O(l) for 
queries. The approach is to essentially insert shortcut edges (corresponding to 
path compression in Union-Find algorithms) in the tree. For connectivity this 
is straightforward, since the goal is to report the root of the tree. For the level 
ancestor problem, the shortcut edges introduce edges with weights in the tree. 
The edge weights change the problem to the problem of answering succ queries. 
However, by applying Gabow’s technique carefully, it is possible to limit the edge 
weights to 0((log n)^). In general we can handle edge weights of size 0{polylogn) 
using micro tree techniques. The basic idea is to use one level of micro trees to 
reduce the edge weights from O(log^k n) to O(log^{k−1} n). Doing this k times only 
increases space and time by a factor k. 

Theorem 8. For any l > 0, we can maintain a forest with n nodes, supporting 
Link in O(α(l, n)) amortized time and LevelAncestor in O(l) worst case time. 

Using the edge weights reduction technique, the succ operation can be sup- 
ported in a static tree and in dynamic forest in the same time as LevelAncestor, 
if the edge weights are polylogarithmically bounded positive integers. 

In order to answer min queries we use the following observation from the level 
ancestor algorithm: Essentially the level ancestor algorithm consists of shortcut- 
ting edges (from a jump table) and special treatment of micro trees. When 
constructing a shortcutting edge we can in the same time associate the mini- 
mum weight the shortcut edge covers reducing the problem to micro trees. In 
order to use the micro tree techniques presented in this paper for min queries, 
we need to know the rank for each weight in a micro tree. Since a micro tree has 
at most O(logn) edges, we can use fast search structures [El to find the rank of 
any weight in constant time. Thus, we conclude, the min operation can be sup- 
ported in a static tree and dynamic forest in the same time as LevelAncestor, 
if non-comparison techniques are allowed. 

Abstract. Lax logical relations are a categorical generalisation of logi- 
cal relations; though they preserve product types, they need not preserve 
exponential types. But, like logical relations, they are preserved by the 
meanings of all lambda-calculus terms. We show that lax logical relations 
coincide with the correspondences of Schoett, the algebraic relations of 
Mitchell and the pre-logical relations of Honsell and Sannella on Henkin 
models, but also generalise naturally to models in cartesian closed cate- 
gories and to richer languages. 



1 Introduction 

Logical relations and various generalisations are used extensively in the study of 
typed lambda calculi, and have many applications, including 

• characterising lambda definability |Pl73|PI^ITTTni[^T^ : 

• relating denotational semantic definitions 

• characterising parametric polymorphism 

• modelling abstract interpretation irai; 

• verifying data representations [Mi91]; 

• defining fully abstract semantics |OH9.'r| : and 

• modelling local state in higher-order languages |OTH5lEt^ . 

The two key properties of logical relations are 

1. the so-called Basic Lemma: a logical relation is preserved by the meaning of 
every lambda term; and 

2. inductive definition: the type-indexed family of relations is determined by 
the base-type components. 

It has long been known that there are type-indexed families of conventional 
relations that satisfy the Basic Lemma but are not determined inductively in a 
straightforward way. Schoett [Sc87] uses families of relations that are preserved 
by algebraic operations to treat behavioural inclusion and equivalence of alge- 
braic data types; he terms them “correspondences,” but they have also been 
called “simulations” frrrn and “weak homomorphisms” iniHi- Furthermore, 
Schoett conjectures (pages 280-81) that the Basic Lemma will hold when ap- 
propriate correspondences are used between models of lambda calculi, and that 
such relations compose. Mitchell Sect. 3.6.2] terms them “algebraic 
relations,” attributing the suggestion to Gordon Plotkin¹ and Samson Abramsky, 
independently, and asserts that the Basic Lemma is easily proved and (binary) al- 
gebraic relations compose. But Mitchell concludes that, because logical relations 
are easily constructed by induction on types, they “seem to be the important 
special case for proving properties of typed lambda calculi.” 

Recently, Honsell and Sannella [HS99] have shown that such relation families, 
which they term “pre-logical relations,” are both the largest class of conventional 
relations on Henkin models that satisfy the Basic Lemma, and the smallest class 
that both includes logical relations and is closed under composition. They give 
a number of examples and applications, and study their closure properties. 

We briefly sketch two of their applications. 

• The composite of (binary) logical relations need not be logical. It is an easy 
exercise to construct a counter-example; see, for instance, PHnni- But the 
composite of binary pre-logical relations is a pre-logical relation. 

• Mitchell [Mi91] showed that the use of logical relations to verify data rep- 
resentations in typed lambda calculi is complete, provided that all of the 
primitive functions are first-order. In [HS99], this is strengthened to allow 
for higher-order primitives by generalising to pre-logical relations. Honsell, 
Longley et al. ra give an example in which a pre-logical relation is used 
to justify the correctness of a data representation that cannot be justified 
using a conventional logical relation. 

In this work, we give a categorical characterisation of algebraic relations 
(simulations, correspondences) between Henkin models of typed lambda calculi. 
The key advantage of this characterisation is its generality. By using it, one can 
immediately generalise from Henkin models to models in categories very different 
from Set, and to languages very different from the simply typed lambda calculus, 
for example to languages with co-products or tensor products, or to imperative 
languages without higher-order constructs. 

The paper is organised as follows. In Sect. 2 we recall the definition of logical 
relation and a category theoretic formulation. In Sect. 3 we give our categorical 
notion of lax logical relation, proving a Basic Lemma, with a converse. In Sect. 4 
we explain the relationship with pre-logical relations and in Sect. 5 give another 
syntax-based characterisation. In Sect. 6 we consider models in cartesian closed 
categories. In Sect. 7, we generalise our analysis to richer languages. 

¹ Plotkin recalls that the suggestion was made to him by Eugenio Moggi in a conver- 
sation. 

2 Logical Relations 

Let Σ be a signature of basic types and constants for the simply typed λ-calculus 
with products ICTiiin , generating a language L. We use σ and τ to range over 
types in L. We denote the set of functions from a set X to a set Y by [X ⇒ Y]. 

Definition 2.1. A model M of L in Set consists of 

• for each σ, a set M_σ, such that M_{σ→τ} = [M_σ ⇒ M_τ], M_{σ×τ} = M_σ × M_τ, 
and M_1 = {*}; 

• for each constant c of Σ of type σ, an element M(c) of M_σ. 

A model extends inductively to send every judgement Γ ⊢ t: σ of L to a function 
M(Γ ⊢ t: σ) from M_Γ to M_σ, where M_Γ is the evident finite product in Set. 
These are “full” type hierarchies; larger classes of models, such as Henkin models 
and cartesian closed categories, will be discussed later. 

Definition 2.2. Given a signature Σ and two models, M and N, of the language 
L generated by Σ, a (binary) logical relation from M to N consists of, for each 
type σ of L, a relation R_σ ⊆ M_σ × N_σ such that 

• for all f ∈ M_{σ→τ} and g ∈ N_{σ→τ}, we have f R_{σ→τ} g if and only if for all 
x ∈ M_σ and y ∈ N_σ, if x R_σ y then f(x) R_τ g(y); 

• for all (x_0, x_1) ∈ M_{σ×τ} and (y_0, y_1) ∈ N_{σ×τ}, we have (x_0, x_1) R_{σ×τ} (y_0, y_1) 
if and only if x_0 R_σ y_0 and x_1 R_τ y_1; 

• * R_1 *; 

• M(c) R_σ N(c) for every constant c in Σ of type σ. 

The data for a binary logical relation are therefore completely determined by its 
behaviour on base types. The fundamental result about logical relations under- 
lying all of their applications is the following. 

Lemma 2.3 (Basic Lemma for Logical Relations). Let R be a binary logical 
relation from M to N; for any term t: σ of L in context Γ, if x R_Γ y, then 
M(Γ ⊢ t: σ)x R_σ N(Γ ⊢ t: σ)y, 

where x R_Γ y is an abbreviation for x_i R_{σ_i} y_i for all i where σ_1, …, σ_n is the 
sequence of types in Γ. It is routine to define n-ary logical relations for an 
arbitrary natural number n, in the spirit of Definition 2.2. The corresponding 
formulation of the Basic Lemma holds for arbitrary n too. 

We now outline a categorical formulation of logical relations iHnnniHsn2i; 
this will be relaxed slightly to yield our semantic characterisation of algebraic 
relations for typed lambda calculi with products. 

The language L determines a cartesian closed term category, which we also 
denote by L, such that a model M of the language L in any cartesian closed 
category such as Set extends uniquely to a cartesian closed functor from L 
to Set [Mi96, Sect. 7.2.6]; i.e., a functor that preserves products and exponentials 
strictly (not just up to isomorphism). We may therefore identify the notion of 
model of the language L in Set with that of a cartesian closed functor from L 
to Set. 

Definition 2.4. The category Rel_2 is defined as follows: an object (X, R, Y) 
consists of a pair of sets X and Y, and a binary relation R ⊆ X × Y; a map 
from (X, R, Y) to (X′, R′, Y′) is a pair of functions (f: X → X′, g: Y → Y′) 
such that x R y implies f(x) R′ g(y): 

[Diagram: square with f: X → X′ along the top, g: Y → Y′ along the bottom, 
and the relations R and R′ as the vertical sides.] 

Composition is given by ordinary composition of functions. 



We denote the forgetful functors from Rel_2 to Set sending (X, R, Y) to X or to 
Y by δ_0 and δ_1, respectively. 

Proposition 2.5. Rel_2 is cartesian closed, and the cartesian closed structure is 
strictly preserved by the functor (δ_0, δ_1): Rel_2 → Set × Set. 

For example, (X_0, R, Y_0) ⇒ (X_1, S, Y_1) is ([X_0 ⇒ X_1], (R ⇒ S), [Y_0 ⇒ Y_1]) 
where f (R ⇒ S) g iff, for all x ∈ X_0 and y ∈ Y_0, x R y implies (f x) S (g y). 
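
For finite sets, the exponential relation R ⇒ S can be enumerated directly. The following Python sketch is only an illustration of the defining condition; the helper names functions and arrow are hypothetical, and functions are represented as dictionaries.

```python
from itertools import product

def functions(X, Y):
    """All functions X -> Y, each represented as a dict."""
    X = list(X)
    return [dict(zip(X, values)) for values in product(list(Y), repeat=len(X))]

def arrow(R, S, X0, Y0, X1, Y1):
    """The relation R => S between [X0 => X1] and [Y0 => Y1], as a list of pairs (f, g):
    f and g are related iff x R y implies f(x) S g(y)."""
    return [(f, g)
            for f in functions(X0, X1)
            for g in functions(Y0, Y1)
            if all((f[x], g[y]) in S for (x, y) in R)]

B = [0, 1]
EQ = {(0, 0), (1, 1)}                 # equality on B

# Functions related by EQ => EQ are exactly the equal ones.
related = arrow(EQ, EQ, B, B, B, B)
assert all(f == g for f, g in related)
assert len(related) == 4

# With the empty relation as premise, every pair of functions is related.
assert len(arrow(set(), EQ, B, B, B, B)) == 16
```

The second check shows why such a relation at function type is determined by, but can look very different from, the relations at the argument and result types.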

These properties of Rel_2, combined with the fact that L is freely generated by 
a signature for cartesian closed categories (i.e., is the generic model on a suitable 
sketch [KO+97]), are the key to understanding logical relations categorically, as 
shown by the following. 

Proposition 2.6. To give a binary logical relation from M to N is equivalent 
to giving a cartesian closed functor R: L → Rel_2 such that (δ_0, δ_1)R = (M, N): 

[Diagram: triangle with R: L → Rel_2, (δ_0, δ_1): Rel_2 → Set × Set and 
(M, N): L → Set × Set.] 

Proof. Given a binary logical relation, one immediately has the object function 
of R: L → Rel_2. The equation (δ_0, δ_1)R = (M, N) determines the behaviour of 
R on arrows. The fact that, for any term t of type σ in context Γ, the pair 
(M(Γ ⊢ t: σ), N(Γ ⊢ t: σ)) satisfies the condition making it an arrow in Rel_2 
from R_Γ to R_σ follows from (and is equivalent to) the Basic Lemma. 

The converse construction is given by taking the object part of a cartesian 
closed functor R: L → Rel_2. It is routine to verify that the two constructions are 
mutually inverse. ■ 

This situation generalises to categories other than Set and Rel_2, the central 
point being that both categories are cartesian closed and (δ_0, δ_1) is a cartesian 
closed functor. We outline the following important example which arises in 
domain theory to deal with logical relations in the context of least fixed points. 
Let C be the category of ω-cpos with ⊥ and continuous maps, and M be the 
class of admissible monos; then there is an evident cartesian closed functor from 
Sub_2(C, M), the category of admissible binary relations between cpos, to C × C. 
A logical relation in this framework is then a cartesian closed functor from L to 
Sub_2(C, M) (coherent with appropriate models of L in C). 

Obviously, one can define a category Rel_n of n-ary relations for an arbitrary 
natural number n; Propositions 2.5 and 2.6 generalise routinely to arbitrary n. 

3 Lax Logical Relations 

In this section, we generalise the categorical notion of logical relation to what 
we call a lax logical relation. 

Definition 3.1. Given a signature Σ and the language L generated by Σ, and 
two models M and N of L in Set, a (binary) lax logical relation from M to 
N is a functor R: L → Rel_2 that strictly preserves finite products and satisfies 
(δ_0, δ_1)R = (M, N). 

Note that exponentials are not necessarily preserved. Evidently, one can adapt 
this definition to one for n-ary lax logical relations for arbitrary n. 

The origin of our terminology is as follows. Any finite-product preserving 
functor R: C → D between cartesian closed categories induces a family of lax 
maps App_{σ,τ}: R_{σ→τ} → [R_σ ⇒ R_τ], obtained by taking the Currying in D of 
the composites 

R_{σ→τ} × R_σ → R_{(σ→τ)×σ} → R_τ 

where the first map is determined by preservation of finite products, and the 
second map is obtained by applying R to the evaluation map in C. This is an 
instance of (op)lax preservation of structure, specifically, exponential structure. 

The notion of Henkin model is closely related to this definition. A Henkin 
model of the simply typed λ-calculus is a finite-product preserving functor from 
L to Set such that the induced lax maps are injective. This is a kind of lax model, 
but is not quite the same as giving a unary lax logical relation; nevertheless, it 
is a natural and useful generalisation of the notion of model we have used, and 
one to which our results routinely extend. 

The Basic Lemma for logical relations extends to lax logical relations; in fact, 
the lax logical relations can be characterised in terms of a Basic Lemma. 

Lemma 3.2 (Basic Lemma for Lax Logical Relations). Let M and N be 
models of L in Set. A family of relations R_σ ⊆ M_σ × N_σ for every type σ of L 
determines a lax logical relation from M to N if and only if for every term t: σ 
of L in context Γ, if x R_Γ y, then M(Γ ⊢ t: σ)x R_σ N(Γ ⊢ t: σ)y, 

where x R_Γ y is an abbreviation for x_i R_{σ_i} y_i for all i when σ_1, …, σ_n is the 
sequence of types in Γ. 

Proof. For the forward (only-if) direction, suppose Γ has sequence of types 
σ_1, …, σ_n. The expression Γ ⊢ t: σ is a map in L from σ_1 × ⋯ × σ_n to σ, 
so R sends it to the unique map from R_{σ_1×⋯×σ_n} to R_σ in Rel_2 that lifts the pair 
(M(Γ ⊢ t: σ), N(Γ ⊢ t: σ)): 

[Diagram: the map M(Γ ⊢ t: σ): M(σ_1 × ⋯ × σ_n) → M(σ) drawn above the map 
N(Γ ⊢ t: σ): N(σ_1 × ⋯ × σ_n) → N(σ).] 

If x_i R_{σ_i} y_i for all i then (x_1, …, x_n) R_{σ_1×⋯×σ_n} (y_1, …, y_n) because R preserves 
finite products, and so the result is now immediate, as x ∈ M(σ_1 × ⋯ × σ_n) = 
(x_1, …, x_n) ∈ M(σ_1) × ⋯ × M(σ_n) and similarly for y. 

For the converse, first taking Γ to be a singleton, the condition uniquely 
determines maps R(Γ ⊢ t: σ): R(Γ) → R(σ) in Rel_2, giving a graph morphism 
from L to Rel_2 such that (δ_0, δ_1)R = (M, N). Such a graph morphism is trivially 
necessarily a functor. Taking Γ ⊢ t: σ to be ⊢ *: 1, where * is the unique 
constant of type 1, the condition yields * R_1 *, so R preserves the terminal object. 
Taking Γ ⊢ t: σ to be a: σ_0, b: σ_1 ⊢ (a, b): σ_0 × σ_1 yields that if x_0 R_{σ_0} y_0 and 
x_1 R_{σ_1} y_1, then (x_0, x_1) R_{σ_0×σ_1} (y_0, y_1). And taking Γ ⊢ t: σ to be a: σ_0 × σ_1 ⊢ 
π_i a: σ_i for i = 0, 1 gives the converse. So R preserves finite products. ■ 

We conclude this section by showing how lax logical relations can be used 
for the two applications of [HS99] previously discussed. 

Definition 3.3. If R is a type-indexed family of binary relations from M to N 
and S is a type-indexed family of binary relations from N to P, their composite 
R ; S is defined component-wise; i.e., (R ; S)_σ = R_σ ; S_σ 

where ; on the right-hand side denotes the conventional composition of binary 
relations. 
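
For finite relations, the composite and its interaction with products can be checked mechanically. The Python sketch below is illustrative only; compose and times are hypothetical helper names, and relations are represented as sets of pairs. The final check is the product step that Proposition 3.4 relies on.

```python
def compose(R, S):
    """Conventional composite R ; S of binary relations given as sets of pairs."""
    return {(x, y) for (x, z) in R for (z2, y) in S if z == z2}

def times(R, S):
    """Product relation: (x0, x1) is related to (y0, y1) iff x0 R y0 and x1 S y1."""
    return {((x0, x1), (y0, y1)) for (x0, y0) in R for (x1, y1) in S}

# Components at two types, written R0, R1 (from M to N) and S0, S1 (from N to P).
R0 = {(1, "a"), (2, "a"), (2, "b")}
S0 = {("a", True), ("b", False)}
R1 = {(0, "u")}
S1 = {("u", "p"), ("u", "q")}

# Composing the product relations gives the product of the composites.
assert compose(times(R0, R1), times(S0, S1)) == times(compose(R0, S0), compose(R1, S1))
```

The equality holds because a witness (z_0, z_1) for the composite at a product type is exactly a pair of witnesses for the two components.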

Proposition 3.4. If R is a binary lax logical relation from M to N and S is a 
binary lax logical relation from N to P, then R ; S is a lax logical relation from 
M to P. 

Proof. We must show that if R: L → Rel_2 and S: L → Rel_2 strictly preserve 
finite products, then so does R ; S. But (x_0, x_1) (R ; S)_{σ×τ} (y_0, y_1) if and only 
if there exists (z_0, z_1) such that (x_0, x_1) R_{σ×τ} (z_0, z_1) and (z_0, z_1) S_{σ×τ} (y_0, y_1), 
and that is so if and only if x_0 (R ; S)_σ y_0 and x_1 (R ; S)_τ y_1. The proof for a 
terminal object is trivial. ■ 

Various other closure properties (such as closure with respect to conjunction 
and universal and existential quantification) have been proved in [HS99] for 
pre-logical relations; the results in the following section show that lax logical 
relations also have these closure properties. 

Definition 3.5. Let M and N be models of L in Set, and OBS be a set of types; 
then M and N are said to be observationally equivalent with respect to OBS 
(written M ≡_OBS N) when, for all σ ∈ OBS and all closed t, t′: σ, M(t) = M(t′) 
if and only if N(t) = N(t′). 



Proposition 3.6. M ≡_OBS N if and only if there exists a lax logical relation 
from M to N which is one-to-one for every σ ∈ OBS. 

Proof. For the forward direction, consider the family of relations R_σ ⊆ M_σ × N_σ 
defined by a R_σ b if and only if there exists a closed term t: σ such that M(t) = a 
and N(t) = b. This is one-to-one on observable types because of observational 
equivalence and a lax logical relation because R_{σ×τ} = R_σ × R_τ. 

For the converse, suppose R_σ ⊆ M_σ × N_σ determine a lax logical relation; 
if σ ∈ OBS then, for all closed t: σ, M(t) R_σ N(t) by the Basic Lemma and 
M(t) = M(t′) if and only if N(t) = N(t′) because R_σ is one-to-one. ■ 

4 Pre-logical Relations 

We can use the Basic Lemma of Sect. 3 and the corresponding result of [HS99] 
to see immediately that, for models as we defined them in Sect. 2, the notions 
of lax logical relation and pre-logical relation coincide. However, in this section 
we give a more direct exposition of the connection for a larger class of models. 
In [HS99], the analysis is primarily in terms of the simply typed λ-calculus 
without product types. But they mention the case of λ-calculi with products 
and models that satisfy surjective pairing. Hence, consider models now to be 
functors M: L → Set that strictly preserve finite products (but not necessarily 
exponentials); these include Henkin models. Everything we have said about lax 
logical relations extends routinely to this class of models. 

Definition 4.1. A pre-logical relation from M to N consists of, for each type 
σ, a relation R_σ ⊆ M_σ × N_σ such that 

1. if x R_σ y and f R_{σ→τ} g, then App^M_{σ,τ} f x R_τ App^N_{σ,τ} g y, where maps 
App^M_{σ,τ} and App^N_{σ,τ} are determined by finite-product preservation of M 
and N, respectively, as discussed in Sect. 3; 

2. M(c) R_σ N(c) for every constant c of type σ, where the constants are deemed 
to include 

• all constants in Σ, 

• *: 1, 

• (−, −): σ → τ → (σ × τ), 

• π_0: σ × τ → σ and π_1: σ × τ → τ, and 

• all instances of combinators S_{ρ,σ,τ}: (ρ → σ → τ) → (ρ → σ) → ρ → τ 
and K_{σ,τ}: σ → τ → σ. 

Theorem 4.2. A type-indexed family of relations R_σ ⊆ M_σ × N_σ determines a 
lax logical relation from M to N if and only if it is a pre-logical relation from 
M to N. 

Proof. For the second clause in the forward direction, treat all constants as maps 
in L with domain 1. For the first clause, note that R_σ × R_{σ→τ} = R_{σ×(σ→τ)}, so 
applying functoriality of R to the evaluation map ev: σ × (σ → τ) → τ in L, 
the result follows immediately. 

For the converse, the second condition implies that, for all closed terms t, 
M(t) R N(t); that fact, combined with the fact that every map in L is an 
unCurrying of a closed term, plus the first condition, imply that R is a graph 
morphism making (δ_0, δ_1)R = (M, N), hence trivially a functor. Since *: 1 is a 
constant and M(*) = * = N(*), we have M(*) R_1 N(*); so R preserves the 
terminal object. Since (−, −) is a constant, it follows that if x_0 R_{σ_0} y_0 and x_1 R_{σ_1} y_1, 
then (x_0, x_1) R_{σ_0×σ_1} (y_0, y_1). The inverse holds because π_0 and π_1 are maps in 
L. So R preserves finite products. ■ 

5 Another Syntax-Based Characterisation 

The key point in the pre-logical characterisation above is that every map in the 
category L is generated by the constants. In this section, we give an alternative 
syntax-based characterisation that generalizes more directly to other languages. 
For simplicity of exposition, we assume, as previously, that models preserve 
exponentials as well as products. 

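Theorem 5.1. To give a lax logical relation from M to N is equivalent to giving,
for each type $\sigma$ of L, a relation
$R_\sigma \subseteq M_\sigma \times N_\sigma$ such that

1. if $f \, R_{(\sigma\times\tau)\to\rho} \, g$, then $\mathrm{Curry}(f) \, R_{\sigma\to\tau\to\rho} \, \mathrm{Curry}(g)$

2. $\mathrm{App} \, R_{((\sigma\to\tau)\times\sigma)\to\tau} \, \mathrm{App}$

3. if $f_0 \, R_{\sigma\to\tau} \, g_0$ and $f_1 \, R_{\sigma\to\rho} \, g_1$, then $(f_0,f_1) \, R_{\sigma\to(\tau\times\rho)} \, (g_0,g_1)$

4. $\pi_0 \, R_{\sigma\times\tau\to\sigma} \, \pi_0$ and $\pi_1 \, R_{\sigma\times\tau\to\tau} \, \pi_1$

5. if $f \, R_{\sigma\to\tau} \, g$ and $f' \, R_{\tau\to\rho} \, g'$, then $(f' \cdot f) \, R_{\sigma\to\rho} \, (g' \cdot g)$

6. $\mathrm{id} \, R_{\sigma\to\sigma} \, \mathrm{id}$

7. $x \, R_\sigma \, y$ if and only if $x \, R_{1\to\sigma} \, y$

8. $M(c) \, R_\sigma \, N(c)$ for every base term c in $\Sigma$ of type $\sigma$.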

We chose the conditions above because the first four conditions seem particularly
natural from the perspective of the λ-calculus, the following two, which are about
substitution, are natural category theoretic conditions, the seventh is mundane,
and the last evident; cf. the “categorical combinators” of [Cu93].

Proof. For the forward direction, the relations $R_\sigma$ are given by the object
part of the functor. The conditions follow immediately from the fact of R being a
functor, thereby having an action on all maps, and from the fact that it strictly
preserves finite products. For instance, there is a map in L from
$(\sigma \to \tau) \times (\sigma \to \rho)$ to $\sigma \to (\tau \times \rho)$,
so that map is sent by R to a map in $\mathrm{Rel}_2$, and R strictly preserves
finite products, yielding the third condition. So using the definition of a map
in $\mathrm{Rel}_2$, and the facts that $(\delta_0,\delta_1)R = (M,N)$ and that M
and N are strict structure preserving functors, we have the result.

For the converse, the family of relations gives the object part of the functor
R. Observe that the axioms imply

• $(x_0,x_1) \, R_{\sigma\times\tau} \, (y_0,y_1)$ if and only if $x_0 \, R_\sigma \, y_0$ and $x_1 \, R_\tau \, y_1$

• $* \, R_1 \, *$, where $*$ is the unique element of $M_1 = N_1 = 1$

So, R strictly preserves finite products providing it forms a functor. The data
for M and N and the desired coherence condition $(\delta_0,\delta_1)R = (M,N)$ on
the putative functor determine its behaviour on maps. It remains to check that the
image of every map in L actually lies in $\mathrm{Rel}_2$. But the conditions
inductively define the Currying of every map in L, so unCurrying by the fifth and
seventh conditions, the result follows. It is routine to verify that these
constructions are mutually inverse. ■
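Conditions 5 and 6 of Theorem 5.1 can be tested mechanically on small finite examples. The Python sketch below does so for a single base type with two elements, taking all of $\sigma$, $\tau$ and $\rho$ to be that base type. The names `functions`, `compose`, `lift`, `closed_under_composition` and `relates_identities` are hypothetical helpers introduced here, and the relation chosen at the function type is the ordinary logical lifting, which is only one of many families satisfying the conditions.

```python
from itertools import product

BASE = (0, 1)

def functions(dom, cod):
    """Every function dom -> cod, stored as a tuple of (input, output) pairs."""
    dom = list(dom)
    return [tuple(zip(dom, outs)) for outs in product(cod, repeat=len(dom))]

def compose(later, earlier):
    """The composite 'later after earlier', in the same tuple representation."""
    table = dict(later)
    return tuple((x, table[y]) for (x, y) in earlier)

def lift(r_dom, r_cod, fs, gs):
    """Relate f and g when they send r_dom-related inputs to r_cod-related outputs."""
    return {(f, g) for f in fs for g in gs
            if all((dict(f)[x], dict(g)[y]) in r_cod for (x, y) in r_dom)}

def closed_under_composition(r_arrow):
    """Condition 5, specialised to sigma = tau = rho."""
    return all((compose(f2, f1), compose(g2, g1)) in r_arrow
               for (f1, g1) in r_arrow for (f2, g2) in r_arrow)

def relates_identities(r_arrow):
    """Condition 6: id R id."""
    identity = tuple((x, x) for x in BASE)
    return (identity, identity) in r_arrow

R_b = {(0, 0), (0, 1), (1, 1)}          # the order relation on {0, 1}
fs = functions(BASE, BASE)
R_arrow = lift(R_b, R_b, fs, fs)

swap = tuple((x, 1 - x) for x in BASE)
R_bad = {(swap, swap)}                  # fails both conditions
```

Since swap composed with itself is the identity, `R_bad` is not closed under composition, which is why it serves as a negative example.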

The result holds for the more general class of models we have discussed, but
an exposition would be encumbered by numerous occurrences of
$\mathrm{App}_{\sigma,\tau}$. It is routine to generalise Theorems 4.2 and 5.1 to
n-ary relations for arbitrary n.



6 Models in Cartesian Closed Categories 



Cartesian closed categories are a more general class of models for typed lambda 
calculi. In this section, we consider a model to be a functor from L to a cartesian 
closed category, strictly preserving finite products and exponentials. 

To discuss “relations” in this context, we adopt the sub-scone approach
described in the literature on sconing (see, e.g., [MS92]). Let C be a cartesian
closed category, S be a finitely complete cartesian closed category, and
$G: C \to S$ be a functor that preserves products (up to isomorphism). A typical
example of a suitable functor G is $\hom(1,-): C \to \mathrm{Set}$, the
global-elements functor; other examples may be found in the references given
above. Then these data determine a category $G\text{-}\mathrm{Rel}_2$ of
categorical (binary) relations on C as follows.

Let $\mathrm{Rel}_2(S)$ be the category of binary relations on S with evident
forgetful functor $\mathrm{Rel}_2(S) \to S \times S$; then pulling back along
$G \times G$ determines a category $G\text{-}\mathrm{Rel}_2$ and a forgetful
functor to $C \times C$. In detail, the objects of $G\text{-}\mathrm{Rel}_2$ are
triples $(a_0, s, a_1)$ where $a_0$ and $a_1$ are objects of C and s is a
sub-object of $G(a_0) \times G(a_1)$; the morphisms from $(a_0, s, a_1)$ to
$(b_0, t, b_1)$ are triples $(f_0, q, f_1)$ such that $f_i: a_i \to b_i$ in C,
$q: \mathrm{dom}\, s \to \mathrm{dom}\, t$ in S, and the following diagram
commutes:

\[
\begin{array}{ccc}
\bullet & \stackrel{s}{\rightarrowtail} & G(a_0) \times G(a_1) \\
{\scriptstyle q}\downarrow & & \downarrow{\scriptstyle G(f_0) \times G(f_1)} \\
\bullet & \stackrel{t}{\rightarrowtail} & G(b_0) \times G(b_1)
\end{array}
\]

Composition and identities are evident. The forgetful functors
$\delta_i: G\text{-}\mathrm{Rel}_2 \to C$ for $i = 0, 1$ are defined by
$\delta_i(a_0, s, a_1) = a_i$ and similarly for morphisms.
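In the simplest instance, where S is Set and sub-objects are literally subsets, the commuting square says that the pair of functions carries related pairs to related pairs, and the mediating map q is then forced. The Python sketch below illustrates this special case only; it assumes G acts as the identity on finite sets, and the name `mediating_map` and the sample data are invented for the illustration.

```python
def mediating_map(s, t, f0, f1):
    """Return the unique q: s -> t making the square commute, or None.

    s and t are sets of pairs (subsets of a0 x a1 and b0 x b1);
    f0 and f1 are dictionaries representing functions a0 -> b0 and a1 -> b1.
    """
    q = {}
    for (x0, x1) in s:
        image = (f0[x0], f1[x1])
        if image not in t:
            return None          # (f0 x f1) does not map s into t
        q[(x0, x1)] = image
    return q

# Hypothetical sample data.
s = {(0, "a"), (1, "b")}
t = {(0, "A"), (10, "B")}
f0 = {0: 0, 1: 10}
f1 = {"a": "A", "b": "B"}
g1 = {"a": "B", "b": "B"}        # does not respect the relations
```

Because t is a subset, q carries no extra information beyond (f0, f1); this matches the faithfulness of the forgetful functor stated in Proposition 6.1.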



Proposition 6.1. $G\text{-}\mathrm{Rel}_2$ is a cartesian closed category and the
cartesian closed structure is strictly preserved by
$(\delta_0, \delta_1): G\text{-}\mathrm{Rel}_2 \to C \times C$; furthermore, this
functor is faithful.



Definition 6.2. Given a signature $\Sigma$ and the language L generated by
$\Sigma$, two models M and N of L in a cartesian closed category C, and a category
$G\text{-}\mathrm{Rel}_2$ of binary categorical relations on C, a (binary) lax
logical relation from M to N is a functor $R: L \to G\text{-}\mathrm{Rel}_2$ that
satisfies $(\delta_0,\delta_1)R = (M,N)$ and strictly preserves finite products.
Lemma 6.3 (Basic Lemma for Categorical Lax Logical Relations). Let
M and N be models of L in a cartesian closed category C, S be a finitely
complete cartesian closed category, and $G: C \to S$ preserve finite products up
to isomorphism; then a family of sub-objects
$R_\sigma \rightarrowtail G(M_\sigma) \times G(N_\sigma)$ for every type $\sigma$
of L determines a lax logical relation from M to N if and only if for every term
t of L of type $\sigma$ in context $\Gamma$, there exists a unique map q that
makes the following diagram commute:

\[
\begin{array}{ccc}
\prod_i R_{\sigma_i} & \rightarrowtail & G(\prod_i M_{\sigma_i}) \times G(\prod_i N_{\sigma_i}) \\
{\scriptstyle q}\downarrow & & \downarrow{\scriptstyle G(M(t)) \times G(N(t))} \\
R_\sigma & \rightarrowtail & G(M_\sigma) \times G(N_\sigma)
\end{array}
\]

where $\sigma_1, \ldots, \sigma_n$ is the sequence of types in $\Gamma$.



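Proof. For the forward direction, R maps $\Gamma \vdash t: \sigma$ to a map

\[
\begin{array}{ccc}
R_{\prod_i \sigma_i} & \rightarrowtail & G(M_{\prod_i \sigma_i}) \times G(N_{\prod_i \sigma_i}) \\
\downarrow & & \downarrow{\scriptstyle G(M(t)) \times G(N(t))} \\
R_\sigma & \rightarrowtail & G(M_\sigma) \times G(N_\sigma)
\end{array}
\]

The result follows because R, M and N preserve products.

In the converse direction, the morphism part of the functor is determined by
the assumed maps. Taking $\Gamma \vdash t: \sigma$ to be
$a: \sigma_0, b: \sigma_1 \vdash (a,b): \sigma_0 \times \sigma_1$ shows that
$R_{\sigma_0} \times R_{\sigma_1} \leq R_{\sigma_0\times\sigma_1}$, using the fact
that G, M and N all preserve products, and taking $\Gamma \vdash t: \sigma$ to be
$p: \sigma_0 \times \sigma_1 \vdash \pi_i\, p: \sigma_i$ for $i = 0, 1$ shows the
converse. Finally, taking $\Gamma \vdash t: \sigma$ to be
$\emptyset \vdash *: 1$ shows that $R_1$ is the “true” sub-object of
$G(M_1) \times G(N_1)$. So R preserves products. ■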



This result can be generalised: replace $G\text{-}\mathrm{Rel}_2$ and
$(\delta_0,\delta_1)$ by any category D with finite products and a faithful
finite-product preserving functor to $C \times C$. This would amount to a lax
version of Peter Freyd’s suggestion [Mi90, Section 3.6.4] of studying logical
relations as subcategories that respect cartesian-closed (here, cartesian)
structure, except generalised from subcategory to faithful functor. But many
applications require entailments to, or from, the “relations,” and so a lax
version of Hermida’s [He93] fibrations with structure to support a
$(\top, \wedge, \Rightarrow, \forall)$ logic might be a more appropriate level of
generality.

To consider composition of (binary) lax logical relations in this context,
assume first that S is the usual category of sets and functions; then the objects
of $G\text{-}\mathrm{Rel}_2$ are subsets of sets of the form $G(a) \times G(b)$.

Proposition 6.4. Composition of (binary) categorical lax logical relations can 
be defined component-wise. 
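As an informal illustration of component-wise composition in the set-based case, the Python sketch below composes two type-indexed families of relations one type at a time and checks that closure under application survives. Everything here is an assumption made for the example: a single base type "b", one function type "b->b", three copies of the same two-element model, and hypothetical helper names.

```python
from itertools import product

BASE = (0, 1)

def functions(dom, cod):
    """Every function dom -> cod, stored as a tuple of (input, output) pairs."""
    dom = list(dom)
    return [tuple(zip(dom, outs)) for outs in product(cod, repeat=len(dom))]

def app(f, x):
    return dict(f)[x]

def lift(r_dom, r_cod, fs, gs):
    """Relate f and g when related inputs are sent to related outputs."""
    return {(f, g) for f in fs for g in gs
            if all((app(f, x), app(g, y)) in r_cod for (x, y) in r_dom)}

def compose_relations(r, s):
    """Ordinary relational composition: x (r;s) z iff x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def compose_families(r, s):
    """Compose two type-indexed families one type at a time."""
    return {ty: compose_relations(r[ty], s[ty]) for ty in r}

def closed_under_application(family):
    return all((app(f, x), app(h, z)) in family["b"]
               for (x, z) in family["b"] for (f, h) in family["b->b"])

fs = functions(BASE, BASE)
leq = {(0, 0), (0, 1), (1, 1)}
R = {"b": leq, "b->b": lift(leq, leq, fs, fs)}   # from model M to model N
S = dict(R)                                      # from model N to model P
T = compose_families(R, S)                       # from model M to model P
```

The composite at the function type is built from the two given relations and is not recomputed as a lifting of the composite at the base type; the check only confirms that application still respects the composed family.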

To allow recursion in L, consider again the category $\mathrm{Sub}_2(C, M)$
discussed earlier, with C being the category of ω-cpos with $\bot$ and M being
the admissible monos. Using the sconing functor
$G = C(1,-): C \to \mathrm{Set}$ gives us a category $G\text{-}\mathrm{Rel}_2$ as
above; because this is constructed as a pullback, there exists a strict
finite-product preserving functor F from $\mathrm{Sub}_2(C, M)$ to
$G\text{-}\mathrm{Rel}_2$. Given any logical relation functor
$R: L \to \mathrm{Sub}_2(C, M)$, composing with F gives a strict finite-product
preserving functor from L to $G\text{-}\mathrm{Rel}_2$ (i.e., a lax logical
relation) between the original models. This shows how composition is supported
in the context of relations on ω-cpos.

More generally, if S is assumed to be a regular category [Bo94], a relational
composition can be defined. Any presheaf category, or indeed any topos,
is a regular category, so this is a mild assumption. An axiomatic treatment
of composition of generalized logical relations, including lax logical relations
as discussed here, can be found in [KO+97], which emerged from category theoretic
treatments of data refinement in which composition of refinements is crucial.

7 Generalising from the A-Calculus 

Earlier, the fundamental facts that gave rise to our definition of lax logical
relation were the correspondence between the simply typed λ-calculus and
cartesian closed categories, and the fact that a signature $\Sigma$ gave rise to
a cartesian closed category L such that a model of $\Sigma$ could be seen as a
functor from L into Set (or, more generally, any cartesian closed category) that
strictly preserved cartesian closed structure. So in generalising from the simply
typed λ-calculus, we generalise the latter fact. This may be done in terms of
algebraic structure, or equivalently (finitary) monads, on Cat. The central paper
about that is Blackwell, Kelly and Power’s [BKP89]. We can avoid much of the
subtlety here by restricting our attention to maps that preserve structure
strictly.

We shall first describe the situation for an arbitrary (finitary) monad T on
Cat extending finite-product structure. One requires Set (or, more generally, any
small category C) to have T-structure, L to be the free T-algebra generated by a
signature, and define a model M of L to be a strict T-algebra map, cf. [KO+97].

A natural general setting in which to define the notion of lax logical relation
involves assuming the existence of a small category E (with finite products) of
relations, and a strict finite-product preserving forgetful functor
$(\delta_0, \delta_1)$ from E to $C \times C$. One then adds to these data further
categorical structure inside the category of small categories and functors that
strictly preserve finite products to generalise the composition of binary
relations. These definitions and related results appear in [KO+97]. Here, we aim
to state a Basic Lemma in familiar terms, and so restrict attention to the special
case that $C = \mathrm{Set}$ and $E = \mathrm{Rel}_2$.

A lax logical relation is a strict finite-product preserving functor from L into
$\mathrm{Rel}_2$ such that composition with $(\delta_0,\delta_1)$ yields $(M,N)$.

We can generalise the Basic Lemma to this level of generality as follows. 

Lemma 7.1 (Basic Lemma for Lax Logical Relations with Algebraic
Structure). A family of relations $R_\sigma \subseteq M_\sigma \times N_\sigma$
for every type $\sigma$ of L determines a lax logical relation from M to N if and
only if, for every term t of L of type $\sigma$ in context $\Gamma$, if
$x \, R_\Gamma \, y$, then
$M(\Gamma \vdash t: \sigma)x \; R_\sigma \; N(\Gamma \vdash t: \sigma)y$.

The proof is exactly as for the earlier Basic Lemma. Similarly, we have the
following.

Proposition 7.2. Binary lax logical relations (at the current level of generality)
compose component-wise.

In the above we have tacitly assumed that contexts are modelled by finite 
products. In general, there is no need to make this assumption: contexts could 
be modelled by a symmetric monoidal structure or, more generally, by a sym- 
metric pre-monoidal structure, or Freyd structure. It would be straightforward 
to generalise our analysis to include such possibilities, but it may be simpler to 
deal with them case by case. For an analysis of the notion of lax logical relations
where contexts are modelled by a Freyd structure, see [KP99].

We next want to generalise Theorem 5.1. In order to do that, we need to
consider the formulation of finitary monads in terms of algebraic structure on
Cat, and we need to restrict to a particular class of such structures. The general
notion of algebraic structure, and the relevant results, appear in [KP93] and
[Po97], and we have included it in the Appendix. Using the notation of the
Appendix, we consider a special class of algebraic structure.

Definition 7.3. Algebraic structure (S, E) on Cat is discrete if S(c) = 0
whenever c is not a discrete category, i.e., whenever c is not the discrete
category on a finite set.

It follows from the definition that any discrete algebraic structure may be
presented by two families of operations: object operations, which have algebras
given by functors of the form $C^k \to C$, and arrow operations, which are given
by natural transformations between object operations. One may put equations
between these to obtain all operations of any discrete algebraic structure, which
are given by functors $C^k \to C^{S_k}$, where $S_k$ is a small category.

Assuming Set has (S, E)-structure for some given discrete algebraic structure
(S, E), a model of an (S, E)-algebra in Set is a functor that strictly preserves
(S, E)-structure.

Examples of discrete algebraic structure have models given by small categories
with finite products, with finite coproducts, with monoidal structure,
symmetric monoidal structure, a monad [Mo91], an endofunctor, a natural
transformation between endofunctors, or any combination of the above.

In order to extend Theorem 5.1 rather than give an analogue, we must include
exponentials, although they are not instances of discrete algebraic structure
as we have defined it. So we henceforth assume that we are given discrete
algebraic structure on Cat extending finite-product structure; that L is generated
by the simply typed λ-calculus, a signature, and the discrete algebraic structure;
that M and N are models of L in Set strictly preserving the algebraic structure,
and, restricting our definition above, that a lax logical relation from M to N is
a finite-product preserving functor from L to $\mathrm{Rel}_2$ such that
composition with $(\delta_0,\delta_1)$ yields $(M,N)$.

A methodology for extending Theorem 5.1 is as follows. Algebraic structure
on Cat is given by an equational presentation. That equational presentation has
operations defining objects and arrows. For each operation defining an arrow, one
adds an axiom to the list in the statement of Theorem 5.1 in the same spirit. For
instance, to define a monoidal structure, one has operations that assign to each
pair of maps $(f, g)$ a map $f \otimes g$, and that give associativity maps and
their inverses, and left and right unit maps and their inverses. So one would add
axioms

• if $f_0 \, R_{\sigma_0\to\tau_0} \, g_0$ and $f_1 \, R_{\sigma_1\to\tau_1} \, g_1$, then $(f_0 \otimes f_1) \, R_{(\sigma_0\otimes\sigma_1)\to(\tau_0\otimes\tau_1)} \, (g_0 \otimes g_1)$;

• $a \, R_{(\sigma\otimes\tau)\otimes\rho \to \sigma\otimes(\tau\otimes\rho)} \, a$, $l \, R_{(\sigma\otimes I)\to\sigma} \, l$, and $r \, R_{(I\otimes\sigma)\to\sigma} \, r$;

• $a^{-1} \, R_{\sigma\otimes(\tau\otimes\rho) \to (\sigma\otimes\tau)\otimes\rho} \, a^{-1}$, $l^{-1} \, R_{\sigma\to(\sigma\otimes I)} \, l^{-1}$, and $r^{-1} \, R_{\sigma\to(I\otimes\sigma)} \, r^{-1}$.



Theorem 7.4. For any discrete algebraic structure on Cat extending
finite-product structure, to give a lax logical relation from M to N is equivalent
to giving a family of relations $R_\sigma \subseteq M_\sigma \times N_\sigma$
satisfying the conditions of Theorem 5.1 and also

• for any k-ary object operation O, if $f_i \, R_{\sigma_i\to\tau_i} \, g_i$ for all $1 \leq i \leq k$, then

$O(f_1, \ldots, f_k) \; R_{O(\sigma_1,\ldots,\sigma_k)\to O(\tau_1,\ldots,\tau_k)} \; O(g_1, \ldots, g_k)$

• for any k-ary arrow operation O, we have

$O(M\sigma_1, \ldots, M\sigma_k) \; R_\gamma \; O(N\sigma_1, \ldots, N\sigma_k)$

where $\gamma = \mathrm{dom}\, O(\sigma_1, \ldots, \sigma_k) \to \mathrm{cod}\, O(\sigma_1, \ldots, \sigma_k)$

• for any k-ary arrow operation O, if $f_i \, R_{\sigma_i\to\tau_i} \, g_i$ for all $1 \leq i \leq k$, then

$\mathrm{dom}\, O(f_1, \ldots, f_k) \; R_\beta \; \mathrm{dom}\, O(g_1, \ldots, g_k)$

where $\beta = \mathrm{dom}\, O(\sigma_1, \ldots, \sigma_k) \to \mathrm{dom}\, O(\tau_1, \ldots, \tau_k)$, and similarly with dom
systematically replaced by cod.

The final two rules here may seem unfamiliar at first sight. An arrow operation
takes a k-ary family of objects to an arrow, so syntactically, takes a k-ary family
of types to an equivalence class of terms. That leads to our penultimate rule.
That a k-ary arrow operation is functorial means that every k-ary family of
arrows is sent to a commutative square. So we need rules to the effect that every
arrow in that square behaves as required. The penultimate rule above accounts
for two arrows of the square, and the final rule accounts for the other two,
where the domain of the commutative square is the arrow of the square uniquely
determined by the definitions.

Proof. Follow the proof of Theorem 5.1. The conditions show inductively that
for every arrow of the category freely generated by the given discrete algebraic
structure applied to the signature, one obtains an arrow in $\mathrm{Rel}_2$. ■

Note that this allows for dropping exponentials, as well as adding various kinds of 
structure such as finite co-products and tensor products. We hope this generality 
will lead to interesting new applications. 






References 



[Ab90] S. Abramsky. Abstract interpretation, logical relations and Kan extensions.
J. of Logic and Computation, 1:5-40, 1990.

[Al95] M. Alimohamed. A characterization of lambda definability in categorical
models of implicit polymorphism. Theoretical Computer Science, 146:5-23, 1995.

[BKP89] R. Blackwell, G. M. Kelly, and A. J. Power. Two dimensional monad theory.
J. of Pure and Applied Algebra, 59:1-41, 1989.

[Bo94] Francis Borceux. Handbook of Categorical Algebra 2, volume 51 of
Encyclopedia of Mathematics and its Applications. Cambridge University Press,
1994.

[Cu93] P.-L. Curien. Categorical Combinators, Sequential Algorithms, and
Functional Programming. Birkhäuser, Boston, 1993.

[FRA99] J. Flum and M. Rodríguez-Artalejo, editors. Computer Science Logic, 13th
International Workshop, CSL ’99, volume 1683 of Lecture Notes in Computer
Science, Madrid, Spain, September 1999. Springer-Verlag, Berlin (1999).

[Gi68] A. Ginzburg. Algebraic Theory of Automata. Academic Press, 1968.

[He93] Claudio A. Hermida. Fibrations, logical predicates, and indeterminates.
Ph.D. thesis, The University of Edinburgh, 1993. Available as Computer
Science Report CST-103-93 or ECS-LFCS-93-277.

[HL+] F. Honsell, J. Longley, D. Sannella, and A. Tarlecki. Constructive data
refinement in typed lambda calculus. To appear in the Proceedings of
FOSSACS 2000, Springer-Verlag Lecture Notes in Computer Science.

[HS99] F. Honsell and D. Sannella. Pre-logical relations. In Flum and
Rodríguez-Artalejo [FRA99], pages 546-561.

[JH90] He Jifeng and C. A. R. Hoare. Data refinement in a categorical setting.
Technical monograph PRG-90, Oxford University Computing Laboratory,
Programming Research Group, Oxford, November 1990.

[JT93] A. Jung and J. Tiuryn. A new characterization of lambda definability. In
M. Bezem and J. F. Groote, editors, Typed Lambda Calculi and Applications,
volume 664 of Lecture Notes in Computer Science, pages 245-257, Utrecht,
The Netherlands, March 1993. Springer-Verlag, Berlin.

[KO+97] Y. Kinoshita, P. O’Hearn, A. J. Power, M. Takeyama, and R. D. Tennent.
An axiomatic approach to binary logical relations with applications to data
refinement. In M. Abadi and T. Ito, editors, Theoretical Aspects of Computer
Software, volume 1281 of Lecture Notes in Computer Science, pages 191-212,
Sendai, Japan, 1997. Springer-Verlag, Berlin.

[KP] Y. Kinoshita and A. J. Power. Data refinement by enrichment of algebraic
structure. To appear in Acta Informatica.

[KP93] G. M. Kelly and A. J. Power. Adjunctions whose counits are coequalizers,
and presentations of finitary enriched monads. Journal of Pure and Applied
Algebra, 89:163-179, 1993.

[KP96] Y. Kinoshita and A. J. Power. Lax naturality through enrichment. J. Pure
and Applied Algebra, 112:53-72, 1996.

[KP99] Y. Kinoshita and J. Power. Data refinement for call-by-value programming
languages. In Flum and Rodríguez-Artalejo [FRA99], pages 562-576.

[La88] Y. Lafont. Logiques, Catégories et Machines. Thèse de Doctorat, Université
de Paris VII, 1988.






[Mi71] R. Milner. An algebraic definition of simulation between programs. In
Proceedings of the Second International Joint Conference on Artificial
Intelligence, pages 481-489. The British Computer Society, London, 1971. Also
Technical Report CS-205, Computer Science Department, Stanford University,
February 1971.

[Mi90] J. C. Mitchell. Type systems for programming languages. In J. van Leeuwen,
editor, Handbook of Theoretical Computer Science, volume B, pages 365-458.
Elsevier, Amsterdam, and The MIT Press, Cambridge, Mass., 1990.

[Mi91] J. C. Mitchell. On the equivalence of data representations. In V. Lifschitz,
editor, Artificial Intelligence and Mathematical Theory of Computation: Papers
in Honor of John McCarthy, pages 305-330. Academic Press, 1991.

[Mi96] J. C. Mitchell. Foundations for Programming Languages. The MIT Press,
1996.

[Mo91] Eugenio Moggi. Notions of computation and monads. Information and
Computation, 93(1):55-92, July 1991.

[MR91] QingMing Ma and J. C. Reynolds. Types, abstraction, and parametric
polymorphism, part 2. In S. Brookes, M. Main, A. Melton, M. Mislove, and
D. Schmidt, editors, Mathematical Foundations of Programming Semantics,
Proceedings of the 7th International Conference, volume 598 of Lecture Notes
in Computer Science, pages 1-40, Pittsburgh, PA, March 1991. Springer-Verlag,
Berlin (1992).

[MS76] R. E. Milne and C. Strachey. A Theory of Programming Language Semantics.
Chapman and Hall, London, and Wiley, New York, 1976.

[MS92] J. C. Mitchell and A. Scedrov. Notes on sconing and relators. In E. Börger,
G. Jäger, H. Kleine Büning, S. Martini, and M. M. Richter, editors, Computer
Science Logic: 6th Workshop, CSL ’92: Selected Papers, volume 702
of Lecture Notes in Computer Science, pages 352-378, San Miniato, Italy,
1992. Springer-Verlag, Berlin (1993).

[OR95] P. O’Hearn and J. Riecke. Kripke logical relations and PCF. Information
and Computation, 120(1):107-116, 1995.

[OT95] P. W. O’Hearn and R. D. Tennent. Parametricity and local variables.
J. ACM, 42(3):658-709, May 1995.

[Pl73] G. D. Plotkin. Lambda-definability and logical relations. Memorandum
SAI-RM-4, School of Artificial Intelligence, University of Edinburgh, October
1973.

[Pl80] G. D. Plotkin. Lambda-definability in the full type hierarchy. In J. P. Seldin
and J. R. Hindley, editors, To H. B. Curry: Essays in Combinatory Logic,
Lambda Calculus and Formalism, pages 363-373. Academic Press, 1980.

[Po97] A. J. Power. Categories with algebraic structure. In M. Nielsen and
W. Thomas, editors, Computer Science Logic, 11th International Workshop,
CSL ’97, volume 1414 of Lecture Notes in Computer Science, pages 389-405,
Aarhus, Denmark, August 1997. Springer-Verlag, Berlin (1998).

[Re74] J. C. Reynolds. On the relation between direct and continuation semantics.
In J. Loeckx, editor, Proc. 2nd Int. Colloq. on Automata, Languages and
Programming, volume 14 of Lecture Notes in Computer Science, pages 141-156.
Springer-Verlag, Berlin, 1974.

[Re83] J. C. Reynolds. Types, abstraction and parametric polymorphism. In
R. E. A. Mason, editor, Information Processing 83, pages 513-523, Paris,
France, 1983. North-Holland, Amsterdam.

[Sc87] O. Schoett. Data abstraction and the correctness of modular programming.
Ph.D. thesis, University of Edinburgh, February 1987. Report CST-42-87.







[St96] I. Stark. Categorical models for local names. Lisp and Symbolic
Computation, 9(1):77-107, February 1996.



Appendix: Algebraic Structure on Categories 



In ordinary universal algebra, an algebra is a set X together with a family of basic
operations $\sigma: X^n \to X$, subject to equations between derived operations. In
order to define algebraic structure on categories, one must replace the set X by a
category A. One also replaces the finite number n by a finitely presentable category
c. All finite categories are finitely presentable, and finite categories are the only
finitely presentable categories we need in this paper. One also allows not only
functions from the set Cat(c, A) into the set of objects of A, but also functions from
the set Cat(c, A) into the set of arrows in A. These are subject to equations between
derived operations. It follows that the category of small such categories with
structure and functors that strictly preserve the structure is equivalent to the
category of algebras, T-Alg, for a finitary monad T on Cat.
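For readers who want a concrete picture of the ordinary universal-algebra case that the appendix starts from, the Python sketch below treats a monoid as a set with two basic operations and checks the equations between derived operations by brute force. It covers only the set-based case, not the categorical generalisation; the helper names `satisfies` and `is_monoid` and the sample carrier are assumptions made for the illustration.

```python
from itertools import product

def satisfies(carrier, lhs, rhs, arity):
    """Check the equation lhs = rhs on every arity-tuple of carrier elements."""
    return all(lhs(*args) == rhs(*args)
               for args in product(carrier, repeat=arity))

def is_monoid(carrier, mul, unit):
    """Basic operations: a binary mul and a nullary unit, subject to three equations."""
    assoc = satisfies(carrier,
                      lambda x, y, z: mul(mul(x, y), z),   # derived ternary operation
                      lambda x, y, z: mul(x, mul(y, z)),   # another derived operation
                      3)
    left = satisfies(carrier, lambda x: mul(unit, x), lambda x: x, 1)
    right = satisfies(carrier, lambda x: mul(x, unit), lambda x: x, 1)
    return assoc and left and right

Z4 = range(4)
add_mod4 = lambda x, y: (x + y) % 4
sub_mod4 = lambda x, y: (x - y) % 4     # not associative, so not a monoid
```

Here the two sides of the associativity equation play the role of derived operations built from the basic operation `mul`.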

All structures relevant to this paper are instances of a slightly more restricted
situation: that of Cat-enriched algebraic structure. So we shall restrict to
Cat-enriched structures here. Let C denote the 2-category Cat of small categories. So
C(A, B) denotes the category of functors from A to B. Let $C_f$ denote the full
sub-2-category of C given by (isomorphism classes of) finitely presentable categories.

Definition A.1. A signature on C is a 2-functor $S: \mathrm{ob}\, C_f \to C$,
regarding $\mathrm{ob}\, C_f$ as a discrete 2-category.

For each $c \in \mathrm{ob}\, C_f$, S(c) is called the category of basic operations
of arity c. Using S, we construct $S_\omega: C_f \to C$ as follows: set

$S_0 = J$, the inclusion of $C_f$ into C, and

$S_{n+1} = J + \sum_{d \in \mathrm{ob}\, C_f} C(d, S_n(-)) \times S(d)$;

and define

$\sigma_0: S_0 \to S_1$ to be $\mathrm{inj}: J \to J + \sum_{d \in \mathrm{ob}\, C_f} C(d, S_0(-)) \times S(d)$; and

$\sigma_{n+1}: S_{n+1} \to S_{n+2}$ to be $J + \sum_{d \in \mathrm{ob}\, C_f} C(d, \sigma_n(-)) \times S(d)$.

Then $S_\omega = \mathrm{colim}_{n<\omega} S_n$, where the colimit exists because C is
cocomplete, and it is a colimit in a functor category with base C. In many cases of
interest, each $\sigma_n$ is a monomorphism, so $S_\omega$ is the union of
$\{S_n\}_{n<\omega}$. For each c, we call $S_\omega(c)$ the category of derived c-ary
operations.

A signature is typically accompanied by equations between derived operations. So 
we say 

Definition A.2. The equations of an algebraic theory with signature S are given
by a 2-functor $E: \mathrm{ob}\, C_f \to C$ together with 2-natural transformations
$\tau_1, \tau_2: E \to S_\omega(K(-))$, where $K: \mathrm{ob}\, C_f \to C_f$ is the
inclusion.

Definition A.3. Algebraic structure on C consists of a signature S, together with
equations $(E, \tau_1, \tau_2)$.
We generally denote algebraic structure by (S, E), suppressing $\tau_1$ and $\tau_2$.
We now define the algebras for a given algebraic structure.

Definition A.4. Given a signature S, an S-algebra consists of a small category A
together with a functor $\nu_c: C(c, A) \to C(S(c), A)$ for each c.

So, an S-algebra consists of a carrier A and an interpretation of the basic operations
of the signature. This interpretation extends canonically to the derived operations,
giving an $S_\omega(K(-))$-algebra, as follows.

• $\nu_0: C(c, A) \to C(S_0(c), A)$ is the identity;

• using the fact that $C(-, A)$ preserves colimits, to give a functor $\nu_{n+1}$ from
$C(c, A)$ to $C(S_{n+1}(c), A)$ is equivalent to giving a functor from $C(c, A)$ to
$C(c, A)$, which we will make the identity, and, for each d in $\mathrm{ob}\, C_f$, a
functor from $C(c, A)$ to $C(C(d, S_n(c)), C(S(d), A))$ or, equivalently, a functor
from $C(c, A) \times C(d, S_n(c))$ to $C(S(d), A)$, which can be inductively defined by

\[
C(c, A) \times C(d, S_n(c)) \xrightarrow{\nu_n \times \mathrm{id}}
C(S_n(c), A) \times C(d, S_n(c)) \xrightarrow{\mathrm{comp}}
C(d, A) \xrightarrow{\nu_d} C(S(d), A)
\]



Definition A.5. Given algebraic structure (S, E), an (S, E)-algebra is an S-algebra
that satisfies the equations, i.e., an S-algebra $(A, \nu)$ such that both legs of

\[
C(c, A) \xrightarrow{\nu_\omega} C(S_\omega(Kc), A)
\;\underset{C(\tau_{2c}, A)}{\overset{C(\tau_{1c}, A)}{\rightrightarrows}}\;
C(E(c), A)
\]

agree.



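Given (S, E)-algebras $(A, \nu)$ and $(B, \delta)$, we define the hom-category

$(S,E)\text{-}\mathrm{Alg}((A, \nu), (B, \delta))$

to be the equaliser in C of the two composites

\[
C(A, B) \xrightarrow{\{C(c,-)\}_{c \in \mathrm{ob}\, C_f}} \prod_c C(C(c, A), C(c, B))
\xrightarrow{\prod_c C(C(c, A),\, \delta_c)} \prod_c C(C(c, A), C(S(c), B))
\]

and

\[
C(A, B) \xrightarrow{\{C(S(c),-)\}_{c \in \mathrm{ob}\, C_f}} \prod_c C(C(S(c), A), C(S(c), B))
\xrightarrow{\prod_c C(\nu_c,\, C(S(c), B))} \prod_c C(C(c, A), C(S(c), B)).
\]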
This agrees with our usual universal-algebraic understanding of the notion of
homomorphism of algebras, internalising it to C. (S, E)-Alg can then be made into a
2-category in which composition is induced by that in C. An arrow in (S, E)-Alg is a
functor $F: A \to B$ such that, for all finitely presentable c,

$F\nu_c(-) = \delta_c(F-): C(c, A) \to C(S(c), B)$

i.e., a functor that commutes with all basic c-ary operations for all c.

A special case of the main result of [KP93] says

Theorem A. 6. A 2-category is equivalent to {S, E)-Alg for algebraic structure {S,E) 
on C if and only if there is a finitary 2-monad T on C such that the 2-category is 
equivalent to T-Alg. 

See [Po97] for an account directed towards a computer science readership.
Abstract. We explain how recent developments in game semantics can
be applied to reasoning about equivalence of terms in a non-trivial frag- 
ment of Idealized Algol (IA) by expressing sets of complete plays as 
regular languages. Being derived directly from the fully abstract game 
semantics for IA, our method of reasoning inherits its desirable theoret- 
ical properties. The method is mathematically elementary and formal, 
which makes it uniquely suitable for automation. We show that reason- 
ing can be carried out using only a meta-language of extended regular 
expressions, a language for which equivalence is formally decidable. 
Keywords: Game semantics, ALGOL-like languages, regular languages 



1 Introduction 

Reynolds’s Idealized Algol (IA) is a compact language which combines the
fundamental features of procedural languages with a full higher-order procedure
mechanism. This combination makes the language very expressive. For example,
simple forms of classes and objects may be encoded in IA. For these reasons,
IA has attracted a great deal of attention from theoreticians; some 20 papers
spanning almost 20 years of research were recently collected in book form.

A common theme in the literature on semantics of IA is
the use of putative program equivalences to test suitability of semantic models.
These example equivalences are intended to capture intuitively valid principles
such as the privacy of local variables, irreversibility of state-changes and
representation independence. A good model should support these intuitions.

Over the years, a variety of models have been proposed, each of which went
some way towards formalizing programming intuition: functor categories gave an
account of variable allocation and deallocation, relational parametricity was
employed to capture representation-independence properties, and linear logic
to explain irreversibility. Recently, many of these ideas have been successfully
incorporated in an operationally-based account of IA by Pitts.

* This author acknowledges the support of a PGSB grant from the Natural Sciences
and Engineering Research Council of Canada. This paper was written while visiting
the University of Edinburgh, Laboratory for Foundations of Computer Science.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 103-|11^ 2000. 
A frustrating situation was created with the development of a fully abstract
game semantics for IA. The full abstraction result means that the model
validates all correct equivalences between programs, but unfortunately the model
as originally presented is complicated, and calculating and reasoning within the
model is difficult.

In this paper, we show that if one restricts attention to the second-order
subset of IA, the games model can be simplified dramatically: terms now denote
regular languages, and a relatively straightforward notation can be used to
describe and calculate with the simplified semantics. The fragment of IA which
we consider contains almost all the example equivalences from the literature,
and we are therefore able to validate them in a largely calculational, algebraic
style, using our semantics. We also obtain a decidability result for equivalence
of programs in this fragment.
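The decidability claim rests on a standard fact: equality of regular languages can be checked mechanically. The Python sketch below illustrates that fact for deterministic finite automata by exploring the product automaton. It is only a generic illustration and not the authors' procedure, which works with extended regular expressions; the dictionary layout for automata and the names `step` and `equivalent` are assumptions made here.

```python
from collections import deque

def step(dfa, state, symbol):
    """Follow one transition; None stands for an implicit rejecting dead state."""
    if state is None:
        return None
    return dfa["delta"].get((state, symbol))

def equivalent(a, b, alphabet):
    """Decide whether two DFAs accept the same language by a product-automaton search."""
    start = (a["start"], b["start"])
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        if (p in a["accept"]) != (q in b["accept"]):
            return False                     # some word is accepted by exactly one DFA
        for symbol in alphabet:
            nxt = (step(a, p, symbol), step(b, q, symbol))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Two presentations of the language (ab)*, and one of (ab)*a.
ab_star = {"start": 0, "accept": {0},
           "delta": {(0, "a"): 1, (1, "b"): 0}}
ab_star_unrolled = {"start": 0, "accept": {0, 2},
                    "delta": {(0, "a"): 1, (1, "b"): 2, (2, "a"): 3, (3, "b"): 0}}
ab_star_a = {"start": 0, "accept": {1},
             "delta": {(0, "a"): 1, (1, "b"): 0}}
```

The search visits at most one pair of states per combination, so it always terminates, which is the essence of why equivalence is decidable in this setting.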

The approach of game semantics, and therefore of this paper, has little
in common with the traditional semantics of IA. Intuitively it comes closest
to Reddy’s “object semantics” [13] and Brookes’s trace semantics for
shared-variable concurrent Algol [2]. Identifiers are not interpreted using an
environment, variables are not interpreted using a notion of store and functions
in the language are not interpreted using a mathematical notion of function.
Instead, we are primarily concerned with behaviour, with all the possible actions
that can be associated with every such language entity. Meanings of phrases are
then constructed combinatorially according to the semantic rules of the language.

We believe our new presentation of game semantics is elementary enough to 
be considered a potential “popular semantics” [SI; it should at least provide a 
point of entry to game semantics for those who have previously found the subject 
opaque. Moreover, the property of full abstraction together with the fact that 
reasoning can be carried out in a decidable formal language suggest that our 
approach constitutes a good foundation on which an automatic program checker 
for IA and related languages can be constructed. The idea of using game
semantics to support automated program analysis has already been independently
explored in a more general framework by Hankin and Malacaria m- They used 
such models to derive static analysis algorithms which can be described without 
reference to games. 

2 The IA Fragment

The principles of the programming language IA were laid down by John Reynolds
in an influential paper H5|. IA is a language that combines imperative features
with a procedure mechanism based on a typed call-by-name lambda calculus;
local variables obey a stack discipline, having a lifetime dictated by syntactic
scope; expressions, including procedures returning a value, cannot have side
effects, i.e. they cannot assign to variables. We conform to these principles, except
for the last one. This flavour of IA is known as IA with active expressions and
has been analyzed extensively lEcig. We consider only the recursion-free second-
order fragment of this language, the fragment which has been used to give
virtually all the significant equivalences mentioned in the literature. In addition,
we will only deal with finite data sets.

The data types of the language (i.e. types of data assignable to variables)
are a finite subset of the integers, and booleans:

\[ \tau ::= \mathbf{int} \mid \mathbf{bool} \]

The phrase types of the language are those of commands, variables and expres-
sions, plus function types.

\[ \sigma ::= \mathbf{comm} \mid \mathbf{var}[\tau] \mid \mathbf{exp}[\tau], \qquad \theta ::= \sigma \mid \sigma \to \theta \]

Note that we include only first-order function types here. We will consider only
terms of the form

\[ \iota_1 : \theta_1, \ldots, \iota_k : \theta_k \vdash M : \sigma \]

that is, terms of ground type with free variables of arbitrary first-order type.
For the sake of simplicity in this paper, we also assume that M is $\beta$-normal,
so that it contains no $\lambda$-abstractions. Function application is restricted to free
identifiers $\iota$. This last restriction can easily be removed, but at the expense of
undue notational overhead in the semantics.

The terms of the language are as follows. In type comm there are basic
commands skip, to do nothing, and $\Omega$ to diverge; in type exp[int] the finitary
fragment contains constants n belonging to a finite subset $\mathcal{N}$ of the set of integers;
and in type exp[bool] there are the constants true and false. There are term
formers for assignment to variables, V := E, dereferencing variables, !V, sequential
composition of commands C; C', and sequential composition of a command
with an expression to yield a possibly side-effecting expression C; E. We have a
conditional operation if B then C else C', a while-loop while B do C, application
of first-order identifiers to arguments $\iota M_1 \ldots M_k$, and the local-variable
declaration new[$\tau$] $\iota$ in C. Here, the free variable $\iota$ : var[$\tau$] of C becomes bound.
Finally, we assume the usual range of binary operations on integer and boolean
expressions.

3 Game Semantics of Idealized Algol 

In game semantics, a computation is represented as an interaction between two 
protagonists: Player (P) represents the program, and Opponent (O) represents 
the environment or context in which the program runs. For example, for a pro- 
gram of the form 

\[ \iota : \mathbf{exp}[\mathbf{int}] \to \mathbf{comm} \vdash M : \mathbf{comm}, \]

Player will represent the program M; Opponent represents the context, in this
case the non-local procedure $\iota$. This procedure, if called by M, may in turn call
an argument, in which case O will ask P to provide this information.

The interaction between O and P consists of a sequence of moves, alternating 
between players. In the game for the type comm, for example, there is an initial
move run to initiate a command, and a single response done to signal termi-
nation. Thus a simple interaction corresponding to the command skip might
be 

O: run (start executing) 

P: done (immediately terminate). 

In more interesting games, such as the one used to interpret programs like

\[ \iota : \mathbf{exp}[\mathbf{int}] \to \mathbf{comm} \vdash \iota(0) : \mathbf{comm}, \]

there are more moves. Corresponding to the result type comm, there are the
moves run and done. The program needs to run the procedure $\iota$, so there are
also moves $run_\iota$ and $done_\iota$ to represent that; here the $run_\iota$ move is a move for
P, and $done_\iota$ is a move for O. Finally, the procedure $\iota$ may need to evaluate its
argument. For this purpose, O has a move $q^1_\iota$, meaning “what is the value of the
first argument to $\iota$?”, to which P may respond with an integer n, tagged as $n^1_\iota$
for the sake of identification.

Here is a sample interaction in the interpretation of the above term. 

O: run (start executing)

P: $run_\iota$ (execute $\iota$)

O: $q^1_\iota$ (what is the first argument to $\iota$?)

P: $0^1_\iota$ (the argument is 0)

O: $done_\iota$ ($\iota$ terminates)

P: done (whole command terminates).

In the above interaction, at the third move, O was not compelled to ask for the
argument to $\iota$; if O represented a non-strict procedure, the move $done_\iota$ would
be played immediately. Similarly, at the fifth move, O could repeat the question
to represent a procedure which calls its argument more than once.

Strategies. Using the above ideas, each possible execution of a program is repre- 
sented as a sequence of moves in the appropriate game. A program can therefore 
be represented as a strategy for P, that is, a predetermined way of responding to 
the moves O makes. A strategy can also choose to make no response in a partic- 
ular situation, representing divergence, so for example there are two strategies 
for the game corresponding to comm: the strategy for skip responds to run 
with done, and the strategy for $\Omega$ fails to respond to run at all.

Strategies are usually represented as sets of sequences of moves, so that a 
strategy is identified with the collection of possible traces that can arise if P 
plays according to that strategy. The fact that O can repeat questions, as we 
remarked above, means that these sets are very often infinite, even for simple 
programs. The strategy for the program $\iota(0)$, for example, is capable of supplying
the argument 0 to $\iota$ as often as O asks for it.

Interpretation of Variables. The type var[$\tau$] is represented as a game in
the following way. For each element x of $\tau$ there is an initial move write(x),
representing an assignment. There is one possible response to this move, ok,
which signals successful completion of the assignment. For dereferencing, there
is an initial move read, to which P may respond with any element of $\tau$.

Here is an interaction in the strategy for

\[ v : \mathbf{var}[\mathbf{int}] \vdash v := {!v} + 1. \]

O: run

P: $read_v$ (get the value from v)

O: $3_v$ (O supplies the value 3)

P: $write(4)_v$ (write 4 into v)

O: $ok_v$ (the assignment is complete)

P: done (the whole command is complete)

In these interactions, O is not constrained to play a good variable in v, i.e. to 
exhibit the expected causal dependency between reads and writes. For example, 
in the game for terms of the form 

\[ c : \mathbf{comm},\ v : \mathbf{var}[\mathbf{int}] \vdash M : \mathbf{comm}, \]

we find interactions such as

\[ run \cdot read_v \cdot 3_v \cdot write(4)_v \cdot ok_v \cdot run_c \cdot done_c \cdot read_v \cdot 7_v \cdots \]

Here O has not played a good variable in v, but this freedom is necessary. Our
semantics must take care of the case in which c is bound to a procedure which
also uses v, for example, the procedure v := 7.

There is one situation in which this kind of interference cannot happen:
when the variable v is made local. This has two effects. The local interaction
with v is guaranteed to exhibit “good variable” behaviour, and the interaction
with v is not an observable part of the program's behaviour. Therefore, the games
interpretation of new v in M is given by taking the set of sequences interpreting
M, considering only those in which O plays a good variable in v, and deleting
all the moves pertaining to v, to hide v from the outside.

Full abstraction. In it was shown that games give rise to a fully abstract
model of IA, in the following sense. Say that an interaction is complete if and
only if it begins with an initial move and ends with a move which answers that
initial move. Thus, for example, $run \cdot run_\iota$ is not complete but
$run \cdot run_\iota \cdot done_\iota \cdot done$ is. Then we have the following theorem:

Theorem 1 (Full Abstraction for IA). For any $\Gamma \vdash P, Q : \theta$, programs
P and Q are contextually equivalent in IA ($P \equiv Q$) if and only if the sets of
complete plays in the strategies interpreting P and Q are equal.

Note. In the above account, a very simple notion of game has been used. In 
fact, games models require a great deal more machinery, including the notions 
of justification pointer and questions and answers, in order for full abstraction 
to be achieved. The key observation which makes the present paper possible is 
that, for the interpretation of IA up to second-order types, this extra machinery
is redundant; it only comes into play at third-order and above.

4 Regular Language Game Semantics

We will now give a simple presentation of the game semantics of our fragment of
IA. The key idea is that the set of complete plays in a strategy forms a regular
language, which leads to a compact notation for defining and manipulating these 
infinite sets of sequences. We define a metalanguage based on regular expressions, 
extended with two handy operations: intersection and hiding. Of course, these 
extensions do not change the regular nature of the languages being defined. 

Definition 1. The set $\mathcal{R}_A$ of extended regular expressions over a finite alphabet
A is defined inductively as the smallest set for which:

Constants: $\bot, \epsilon \in \mathcal{R}_A$; if $a \in A$, then $a \in \mathcal{R}_A$;

Iteration: if $R \in \mathcal{R}_A$, then $R^* \in \mathcal{R}_A$;

Operators: if $R, S \in \mathcal{R}_A$, then $R \cdot S,\ R + S,\ R \cap S \in \mathcal{R}_A$;

Hiding: if $R \in \mathcal{R}_A$ and $A' \subseteq A$, then $R|_{A'} \in \mathcal{R}_A$.

The constant $\bot$ denotes the empty language, while $\epsilon$ is the language consisting
only of the empty string. The constant a is the language consisting of the sin-
gleton sequence a. Hiding represents the operation of restricting a language to a
subset $A \setminus A'$ of the original alphabet A: the language $\mathcal{L}(R|_{A'})$ is the set of
sequences in $\mathcal{L}(R)$, with all elements of A' deleted. The other operations (iteration,
concatenation, union, intersection) are defined as usual.

Proposition 1. Every extended regular expression denotes a regular language. 
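
The operations of Definition 1 can be tried out on small examples. The sketch below is only an approximation made for illustration: it represents a language by the finite set of its words of length at most `MAX_LEN`, so iteration is truncated and hiding only sees words that fit the bound before deletion. The names `sym`, `cat`, `star`, `hide` and `MAX_LEN` are introduced here and are not notation from the paper; union and intersection are Python's set operators.

```python
MAX_LEN = 6  # assumption: only words of at most MAX_LEN symbols are tracked

BOTTOM = frozenset()        # the empty language
EPSILON = frozenset({()})   # the language holding only the empty word


def sym(a):
    """The language containing the single one-symbol word a."""
    return frozenset({(a,)})


def cat(r, s):
    """Concatenation R.S, truncated to words of length <= MAX_LEN."""
    return frozenset(u + v for u in r for v in s if len(u) + len(v) <= MAX_LEN)


def star(r):
    """Iteration R*: keep appending words of R until nothing new fits the bound."""
    result = frontier = EPSILON
    while frontier:
        frontier = cat(frontier, r) - result
        result = result | frontier
    return result


def hide(r, hidden):
    """Hiding: delete every symbol that belongs to `hidden` from every word."""
    return frozenset(tuple(a for a in w if a not in hidden) for w in r)


# Complete plays of the form run . run_i . (q_i . 0_i)* . done_i . done,
# i.e. a command that calls a procedure which may ask for its argument repeatedly.
plays = cat(sym("run"), cat(sym("run_i"),
        cat(star(cat(sym("q_i"), sym("0_i"))), cat(sym("done_i"), sym("done")))))

# Hiding the argument moves leaves only the outer bracketing.
visible = hide(plays, {"q_i", "0_i"})
```

Because the representation is bounded, it can confirm small instances of an identity but cannot prove one; an exact treatment would work with automata instead.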

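We now give a regular language representation of the game semantics for IA.
An alphabet is associated with every type in IA. They represent a semantic “do-
main” over which regular languages will be constructed, using extended regular
expressions:

\begin{align*}
\mathcal{A}[\![\mathbf{int}]\!] &= \mathcal{N}, \qquad \mathcal{A}[\![\mathbf{bool}]\!] = \{\mathit{true}, \mathit{false}\}, \\
\mathcal{A}[\![\mathbf{comm}]\!] &= \{\mathit{run}, \mathit{done}\}, \\
\mathcal{A}[\![\mathbf{exp}[\tau]]\!] &= \{q, v \mid v \in \mathcal{A}[\![\tau]\!]\}, \\
\mathcal{A}[\![\mathbf{var}[\tau]]\!] &= \{\mathit{read}, v, \mathit{write}(v), \mathit{ok} \mid v \in \mathcal{A}[\![\tau]\!]\}, \\
\mathcal{A}[\![\sigma_1 \to \sigma_2 \to \cdots \to \sigma_k \to \sigma]\!] &= \{a^i \mid a \in \mathcal{A}[\![\sigma_i]\!],\ 1 \le i \le k\} \cup \mathcal{A}[\![\sigma]\!].
\end{align*}

By $a^k$ we mean a lexical operation: the creation of a new symbol by tagging the
symbol a with the numeral k.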

For a term of the form

\[ \iota_1 : \theta_1,\ \iota_2 : \theta_2,\ \ldots,\ \iota_k : \theta_k \vdash M : \sigma \]

we define the context alphabet to be the set

\[ \bigcup_{1 \le j \le k} \{ a_{\iota_j} \mid a \in \mathcal{A}[\![\theta_j]\!] \}, \]

that is, the union of the alphabets, every symbol tagged with the corre-
sponding identifier.

The semantics of a term M as above is then a regular language of a certain
form, defined as follows.

- If $\sigma = \mathbf{comm}$, $[\![M]\!] = \mathit{run} \cdot R_M \cdot \mathit{done}$.

- If $\sigma = \mathbf{exp}[\tau]$, $[\![M]\!] = \sum_{v \in \mathcal{A}[\![\tau]\!]} q \cdot R_M^v \cdot v$.

- If $\sigma = \mathbf{var}[\tau]$,

\[ [\![M]\!] = \sum_{v \in \mathcal{A}[\![\tau]\!]} (\mathit{read} \cdot R_M^v \cdot v) + (\mathit{write}(v) \cdot S_M^v \cdot \mathit{ok}) \]

where $R_M$, $R_M^v$, $S_M^v$ are regular languages over the context alphabet of the
term M. The idea is that, for M of type comm, for example, the regular language
$R_M$ is the set of interactions with the environment that need to take place for
M to terminate. Similarly, $R_M^3$ is the set of interactions that an expression M
must have with the environment to return a value of 3, and so on. For M of type
var[$\tau$], $R_M^v$ denotes the interactions required for a value v to be read from M,
and $S_M^v$ denotes the interactions needed to write v into M.

These regular languages, denoted by $R_M$, $R_M^v$, $S_M^v$, form the substance of
our interpretation of the language; the moves that bracket them, such as run,
done for commands, are merely delimiters to indicate that a complete play has
occurred. The definitions needed to interpret most of our language are given in
Table 1.



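Table 1. Some semantic valuations

\begin{align*}
R_{\mathbf{skip}} &= \epsilon \qquad R_{\Omega} = \bot \qquad R_v^v = \epsilon \qquad R_v^{v'} = \bot \ (v \ne v') \\
R_{\iota : \mathbf{comm}} &= run_\iota \cdot done_\iota \qquad R^v_{\iota : \mathbf{exp}[\tau]} = q_\iota \cdot v_\iota \\
R^v_{\iota : \mathbf{var}[\tau]} &= read_\iota \cdot v_\iota \qquad S^v_{\iota : \mathbf{var}[\tau]} = write(v)_\iota \cdot ok_\iota \\
R_{\mathbf{while}\ B\ \mathbf{do}\ C} &= (R_B^{true} \cdot R_C)^* \cdot R_B^{false} \\
R^n_{E_1 + E_2} &= \sum_{n_1 + n_2 = n} R^{n_1}_{E_1} \cdot R^{n_2}_{E_2} \\
R^{true}_{E_1 = E_2} &= \sum_{n \in \mathcal{N}} R^{n}_{E_1} \cdot R^{n}_{E_2} \\
R^{false}_{E_1 = E_2} &= \sum_{n_1 \ne n_2} R^{n_1}_{E_1} \cdot R^{n_2}_{E_2} \\
R_{\mathbf{if}\ B\ \mathbf{then}\ C\ \mathbf{else}\ C'} &= R_B^{true} \cdot R_C + R_B^{false} \cdot R_{C'} \\
R_{C;C'} &= R_C \cdot R_{C'} \\
R^v_{!V} &= R^v_V \qquad R_{V := M} = \sum_{v} R^v_M \cdot S^v_V
\end{align*}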



For instance, a trace of V := E consists of run and done surrounding the
effects of the assignment: first $R_E^v$, which is the regular language denoting the
interaction which leads the expression E to return value v, and then $S_V^v$, which
is the regular language denoting the interaction required to write value v into
variable V.

A trace of a while-loop has the form: some number of repetitions of a trace 
of the guard which produces true followed by a complete trace of the loop body, 
then, finally, a single trace of the guard producing false. Using our semantics, 
we can easily demonstrate the validity of a typical while-loop equivalence: 

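\begin{align*}
[\![\mathbf{while\ true\ do}\ C]\!]
&= run \cdot (R^{true}_{\mathbf{true}} \cdot R_C)^* \cdot R^{false}_{\mathbf{true}} \cdot done \\
&= run \cdot (\epsilon \cdot R_C)^* \cdot \bot \cdot done \\
&= \bot = [\![\Omega]\!].
\end{align*}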
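
The calculation above can be replayed on a bounded approximation of the languages involved. This is a minimal sketch, assuming languages are represented by their words of length at most `MAX_LEN`; the helper names (`cat`, `star`, `while_loop`, `command`) and the move names `run_c`, `q_b` and so on are introduced here for illustration only.

```python
MAX_LEN = 8  # assumption: words longer than this are discarded

BOTTOM, EPSILON = set(), {()}


def sym(a):
    return {(a,)}


def cat(*langs):
    """Concatenate languages left to right under the length bound."""
    result = {()}
    for lang in langs:
        result = {u + v for u in result for v in lang if len(u) + len(v) <= MAX_LEN}
    return result


def star(lang):
    result, frontier = {()}, {()}
    while frontier:
        frontier = cat(frontier, lang) - result
        result |= frontier
    return result


def while_loop(guard_true, guard_false, body):
    """(R_B^true . R_C)* . R_B^false, the while-loop valuation."""
    return cat(star(cat(guard_true, body)), guard_false)


def command(r):
    """Bracket R_M with run and done to obtain the plays of a command."""
    return cat(sym("run"), r, sym("done"))


body = cat(sym("run_c"), sym("done_c"))  # R_C for a free identifier c : comm

# Guard is the constant true: it yields true silently and never yields false.
diverging = command(while_loop(EPSILON, BOTTOM, body))
omega = command(BOTTOM)

# For contrast, a guard read from a free identifier b : exp[bool].
guarded = command(while_loop(cat(sym("q_b"), sym("true_b")),
                             cat(sym("q_b"), sym("false_b")), body))
```

Concatenating with the empty language yields the empty language, which is why `diverging` and `omega` coincide, whereas `guarded` contains a play for every number of iterations that fits the bound.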

The semantics of a free identifier $\iota$ consist simply of querying the identifier. There
is no need to look up the identifier in an environment, because the tagging of the 
trace with the name of the identifier ensures the proper correspondence between 
each identifier and its effects. Therefore, a notion of environment is not needed 
here at all. 

The semantics of application and of local variables have been omitted from 
Table 1 because they deserve additional explanation.



Application. Let $\iota$ be a free variable of type $\sigma_1 \to \sigma_2 \to \cdots \to \sigma_k \to \mathbf{comm}$,
and $M_1, \ldots, M_k$ be terms of type $\sigma_1, \ldots, \sigma_k$. The interpretation of the appli-
cation $\iota M_1 \ldots M_k$ depends on the moves available, which depends on the types
$\sigma_1, \ldots, \sigma_k$. In the simplest case, when every $\sigma_j$ is the type comm, we define

\[ R_{\iota M_1 \ldots M_k} = run_\iota \cdot \Big( \sum_{j=1}^{k} run^j_\iota \cdot R_{M_j} \cdot done^j_\iota \Big)^* \cdot done_\iota. \]

To illustrate a more complex case, we give the definition of the interpretation of
$\iota M$ where $\iota$ has type var[int] $\to$ exp[int].

\[ R^v_{\iota M} = q_\iota \cdot \Big( \sum_{n} read^1_\iota \cdot R^n_M \cdot n^1_\iota + \sum_{n} write(n)^1_\iota \cdot S^n_M \cdot ok^1_\iota \Big)^* \cdot v_\iota. \]

The large sums in this expression show that the environment chooses how to
read and write from the argument to $\iota$, and that the term M determines what
behaviour results from such reading and writing.

In general, for a variable $\iota : \sigma_1 \to \sigma_2 \to \cdots \to \sigma_k \to \mathbf{comm}$:

\[ R_{\iota M_1 \ldots M_k} = run_\iota \cdot \Big( \sum_{j=1}^{k} \rho^j_\iota([\![M_j]\!]) \Big)^* \cdot done_\iota \]

where $\rho^j_\iota$ is a relabeling operation that tags the initial and final moves of the
arguments $M_j$, the bracketing indicating a complete play, with the identifier
which is calling them and the position in which they are used:

\[ \rho^j_\iota(R) = R[w^j_\iota / w], \quad \text{for } w \in \{run, done, q, v, read, write(v), ok \mid v \in \mathcal{A}[\![\tau]\!]\}. \]






Local variables. For the semantics of a local variable block, as in the original 
game semantics, there are two things to do: restrict O’s behaviour to that of a 
good variable, and hide the interaction with the local variable. 

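The regular language $\gamma^\tau_\iota$ stipulates that the moves corresponding to $\iota$ have
good-variable behaviour. First, let $\mathcal{A}[\![\tau]\!]_\iota$ be that part of the alphabet which
concerns the variable $\iota$ : var[$\tau$], that is,

\[ \mathcal{A}[\![\tau]\!]_\iota = \{read_\iota, v_\iota, write(v)_\iota, ok_\iota \mid v \in \mathcal{A}[\![\tau]\!]\}. \]

Let $B_\iota$ be the regular language containing all strings which
do not contain any elements of $\mathcal{A}[\![\tau]\!]_\iota$. If we assume that variables initially hold
some default value $a_\tau$, then good-variable behaviour is stipulated as follows.

\[ \gamma^\tau_\iota = B_\iota \cdot (read_\iota \cdot (a_\tau)_\iota \cdot B_\iota)^* \cdot \Big( B_\iota \cdot \sum_{v \in \mathcal{A}[\![\tau]\!]} \big( write(v)_\iota \cdot ok_\iota \cdot B_\iota \cdot (read_\iota \cdot v_\iota \cdot B_\iota)^* \big) \Big)^* \]

For the sake of completeness, $a_{\mathbf{int}} = 0$ and $a_{\mathbf{bool}} = \mathit{false}$. We can then give the
semantics of blocks as

\[ R_{\mathbf{new}[\tau]\ \iota\ \mathbf{in}\ M} = (\gamma^\tau_\iota \cap R_M)|_{\mathcal{A}[\![\tau]\!]_\iota}. \]

Note that the same intersection and hiding can be used to define $[\![\mathbf{new}[\tau]\ \iota\ \mathbf{in}\ M]\!]$
directly from $[\![M]\!]$: the bracketing moves, run and done, make no difference.

\[ [\![\mathbf{new}[\tau]\ \iota\ \mathbf{in}\ M]\!] = (\gamma^\tau_\iota \cap [\![M]\!])|_{\mathcal{A}[\![\tau]\!]_\iota}. \]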
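
The two steps of the block semantics, keeping only the plays in which the variable behaves well and then hiding its moves, can be sketched on explicit finite sets of traces. The encoding below is an assumption made for illustration: a move is a tuple whose second component (when present) is its tag, with kinds `"read"`, `"val"`, `"write"` and `"ok"` for variable moves; `good_variable` and `new_block` are hypothetical helper names.

```python
def good_variable(trace, var, default=0):
    """True if the moves tagged `var` behave like a cell initially holding `default`."""
    stored, expect = default, None
    for move in trace:
        on_var = len(move) >= 2 and move[1] == var
        if expect is not None:
            # A read must be answered at once by the stored value, a write by ok.
            answer = ("val", var, stored) if expect == "read" else ("ok", var)
            if move != answer:
                return False
            expect = None
        elif on_var and move[0] == "read":
            expect = "read"
        elif on_var and move[0] == "write":
            stored, expect = move[2], "write"
        elif on_var:
            return False  # an answer move without a pending question
    return expect is None


def new_block(plays, var, default=0):
    """Intersect with good-variable behaviour, then hide the moves tagged `var`."""
    return {tuple(m for m in t if not (len(m) >= 2 and m[1] == var))
            for t in plays if good_variable(t, var, default)}


# Plays of v := !v + 1 for a few values the environment might supply.
increment = {(("run",), ("read", "v"), ("val", "v", n),
              ("write", "v", n + 1), ("ok", "v"), ("done",)) for n in range(4)}

local = new_block(increment, "v")
```

Only the play in which the read returns the default value 0 survives the restriction, and after hiding it collapses to run followed by done, the same plays as skip.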

Theorem 2. Full abstraction. Two terms of the recursion free second order
finitary fragment of IA are equivalent (in full IA) if and only if the languages
denoted by them are equal:

\[ \text{For any } \Gamma \vdash P, Q : \theta, \qquad P \equiv Q \iff [\![P]\!] = [\![Q]\!]. \]

Proof. We can show that the regular language denoted by a term of IA is equal
to the set of complete plays in the fully abstract game semantics P, therefore the
full abstraction property is preserved. Note that language equivalence is asserted
outside the fragment we describe here; witnesses to some inequivalences may
belong to IA but not to the presented fragment. □

5 Examples of Reasoning 

At this point a skeptical reader may entertain doubts concerning our earlier claim 
of simplicity. We have set up a formal notation of extended regular expressions 
which includes rather complicated operations. However, the complications are 
notational and not conceptual. Also, all the operations involved are defined effec- 
tively so carrying them out is a mechanical process. We hope that the simplicity 
of our approach will become clearer when we show examples of reasoning about 
putative equivalences.

Locality. This most simple of equivalences invalidates models of imperative
computation relying on a global store, traceable back to Scott and Strachey m.
It says that a globally defined procedure cannot modify a local variable, and 
it was first proved using the “possible worlds” model of Reynolds and Oles, 
constructed using functor categories El. 

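\[ P : \mathbf{comm} \vdash \mathbf{new}\ x\ \mathbf{in}\ P \equiv P \]

Proof.

\begin{align*}
[\![\mathbf{new}\ x\ \mathbf{in}\ P]\!] &= (\gamma^\tau_x \cap [\![P]\!])|_{\mathcal{A}[\![\tau]\!]_x} \\
&= (\gamma^\tau_x \cap run \cdot run_P \cdot done_P \cdot done)|_{\mathcal{A}[\![\tau]\!]_x} \\
&= (run \cdot run_P \cdot done_P \cdot done)|_{\mathcal{A}[\![\tau]\!]_x} \\
&= run \cdot run_P \cdot done_P \cdot done && \text{because no moves are tagged by } x \\
&= [\![P]\!]
\end{align*}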



Snapback. This example captures the intuition that changes to the state are in 
some way irreversible. A procedure executing an argument which is a command 
inflicts upon the state changes that cannot be undone from within the procedure. 
This is why, in the following, if procedure P uses its argument both sides will 
fail to terminate; if procedure P does not use its argument the behaviour of each 
side will be identical because of the locality of x, as seen above. The first model 
to address this issue correctly was O’Hearn and Reynolds’s interpretation of
IA using the polymorphic linear lambda calculus. Reddy also addressed this
issue using a novel “object semantics” approach El, but in a particular flavour
of IA known as interference-controlled Algol |S|. A further development of this
model, that also satisfies this equivalence, is O’Hearn and Reddy’s |Z], a model 
fully abstract for the second order subset. 

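\[ P : \mathbf{comm} \to \mathbf{comm} \vdash
\mathbf{new}\ x\ \mathbf{in}\ P(x := 1);\ \mathbf{if}\ {!x} = 1\ \mathbf{then}\ \Omega\ \mathbf{else}\ \mathbf{skip} \equiv P(\Omega) \]

Proof.

\begin{align*}
[\![x := 1]\!] &= run \cdot write(1)_x \cdot ok_x \cdot done \\
[\![P(x := 1)]\!] &= run \cdot run_P \cdot (run^1_P \cdot write(1)_x \cdot ok_x \cdot done^1_P)^* \cdot done_P \cdot done \\
[\![\mathbf{if}\ {!x} = 1\ \mathbf{then}\ \Omega\ \mathbf{else}\ \mathbf{skip}]\!] &= run \cdot \sum_{n \ne 1} read_x \cdot n_x \cdot done
\end{align*}

\begin{align*}
& [\![P(x := 1);\ \mathbf{if}\ {!x} = 1\ \mathbf{then}\ \Omega\ \mathbf{else}\ \mathbf{skip}]\!] \\
&\quad = run \cdot run_P \cdot (run^1_P \cdot write(1)_x \cdot ok_x \cdot done^1_P)^* \cdot done_P \cdot \Big( \sum_{n \ne 1} read_x \cdot n_x \Big) \cdot done \\
& \gamma^\tau_x \cap [\![P(x := 1);\ \mathbf{if}\ {!x} = 1\ \mathbf{then}\ \Omega\ \mathbf{else}\ \mathbf{skip}]\!] \\
&\quad = run \cdot run_P \cdot done_P \cdot read_x \cdot 0_x \cdot done,
\end{align*}

because the only possibility to complete a trace in $\sum_{n \ne 1} read_x \cdot n_x$ is if the trace
in $(run^1_P \cdot write(1)_x \cdot ok_x \cdot done^1_P)^*$ is the empty trace. Otherwise, the good variable
property of x requires $n_x = 1_x$, which is banned by the set to which n is restricted
($n \ne 1$). The meaning of the left hand term of the equivalence is therefore:

\begin{align*}
& (\gamma^\tau_x \cap [\![P(x := 1);\ \mathbf{if}\ {!x} = 1\ \mathbf{then}\ \Omega\ \mathbf{else}\ \mathbf{skip}]\!])|_{\mathcal{A}[\![\tau]\!]_x} \\
&\quad = run \cdot run_P \cdot done_P \cdot done = [\![P(\Omega)]\!]
\end{align*}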
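
The Snapback argument can be checked on a finite slice of the language. The sketch below assumes a tuple encoding of moves (kind, tag, optional payload), bounds the number of times P calls its argument, and draws the value read from a small set; `good_variable`, `new_block` and `snapback_lhs` are hypothetical names introduced for this illustration.

```python
def good_variable(trace, var, default=0):
    """True if the moves tagged `var` behave like a cell initially holding `default`."""
    stored, expect = default, None
    for move in trace:
        on_var = len(move) >= 2 and move[1] == var
        if expect is not None:
            answer = ("val", var, stored) if expect == "read" else ("ok", var)
            if move != answer:
                return False
            expect = None
        elif on_var and move[0] == "read":
            expect = "read"
        elif on_var and move[0] == "write":
            stored, expect = move[2], "write"
        elif on_var:
            return False
    return expect is None


def new_block(plays, var):
    """Keep the plays with good-variable behaviour, then hide the moves on `var`."""
    return {tuple(m for m in t if not (len(m) >= 2 and m[1] == var))
            for t in plays if good_variable(t, var)}


def snapback_lhs(max_calls=3, values=(0, 1, 2)):
    """Plays of P(x := 1); if !x = 1 then diverge else skip, up to max_calls calls."""
    call = (("run", "P", 1), ("write", "x", 1), ("ok", "x"), ("done", "P", 1))
    plays = set()
    for k in range(max_calls + 1):
        for n in values:
            if n != 1:  # only the else-branch produces a complete play
                plays.add((("run",), ("run", "P")) + call * k
                          + (("done", "P"), ("read", "x"), ("val", "x", n), ("done",)))
    return plays


lhs = new_block(snapback_lhs(), "x")
rhs = {(("run",), ("run", "P"), ("done", "P"), ("done",))}  # plays of P applied to a diverging argument
```

Every play in which P runs its argument at least once is discarded, because the subsequent read would have to return 1; what remains after hiding is the single play in which P ignores its argument.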

Parametricity. The intuition of parametricity is one of representation inde- 
pendence. Procedures passed different but equivalent implementations of a data 
structure or algorithm are not supposed to be able to distinguish between them. 
Several such motivating examples are given by O’Hearn and Tennent [3, who 
introduce a model constructed using a certain relation-preserving functor cate- 
gory. 

The specific example we give is of the equivalence of two implementations 
of a toggle-switch: one which uses 1 for “on” and $-1$ for “off”, and one which
uses true and false. The semantic equations for negation and the inequality test 
have not been spelled out but are the obvious ones. 

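\begin{align*}
P : \mathbf{comm} \to \mathbf{exp}[\mathbf{bool}] \to \mathbf{comm} \vdash{}
& \mathbf{new}[\mathbf{int}]\ x\ \mathbf{in}\ x := 1;\ P(x := -{!x})({!x} > 0) \\
\equiv{} & \mathbf{new}[\mathbf{bool}]\ x\ \mathbf{in}\ x := \mathbf{true};\ P(x := \mathbf{not}\ {!x})({!x})
\end{align*}

Proof.

\begin{align*}
[\![x := -{!x}]\!] &= run \cdot \sum_{n} read_x \cdot n_x \cdot write(-n)_x \cdot ok_x \cdot done \\
[\![{!x} > 0]\!] &= \sum_{n > 0} q \cdot read_x \cdot n_x \cdot true + \sum_{n \le 0} q \cdot read_x \cdot n_x \cdot false
\end{align*}

\begin{align*}
& [\![x := 1;\ P(x := -{!x})({!x} > 0)]\!] = run \cdot write(1)_x \cdot ok_x \cdot run_P \cdot \\
&\qquad \Big( \sum_{n} run^1_P \cdot read_x \cdot n_x \cdot write(-n)_x \cdot ok_x \cdot done^1_P \\
&\qquad\quad + \sum_{n > 0} q^2_P \cdot read_x \cdot n_x \cdot true^2_P + \sum_{n \le 0} q^2_P \cdot read_x \cdot n_x \cdot false^2_P \Big)^* \cdot done_P \cdot done
\end{align*}

\begin{align*}
& \gamma^{\mathbf{int}}_x \cap [\![x := 1;\ P(x := -{!x})({!x} > 0)]\!] \\
&\quad = run \cdot write(1)_x \cdot ok_x \cdot run_P \cdot (\epsilon + X + X \cdot Y + X \cdot Y \cdot X + \ldots) \cdot done_P \cdot done \\
&\quad = run \cdot write(1)_x \cdot ok_x \cdot run_P \cdot (X + (X \cdot Y)^* \cdot (X + \epsilon)) \cdot done_P \cdot done
\end{align*}

where

\begin{align*}
X &= run^1_P \cdot read_x \cdot 1_x \cdot write(-1)_x \cdot ok_x \cdot done^1_P \cdot (q^2_P \cdot read_x \cdot (-1)_x \cdot false^2_P)^* \\
Y &= run^1_P \cdot read_x \cdot (-1)_x \cdot write(1)_x \cdot ok_x \cdot done^1_P \cdot (q^2_P \cdot read_x \cdot 1_x \cdot true^2_P)^*
\end{align*}

Why this is the case should be intuitively clear. A value of 1 is written into x,
followed by negation only, which constrains all the plays to $(+1)_x$ and $(-1)_x$
only. The reads and writes have to match with the good variable behaviour.
A fully formal proof is lengthier but trivial and mechanical. Restricting with
$|_{\mathcal{A}[\![\mathbf{int}]\!]_x}$ gives the following trace for the left hand side:

\[ run \cdot run_P \cdot (X' + (X' \cdot Y')^* \cdot (X' + \epsilon)) \cdot done_P \cdot done \]

where $X' = run^1_P \cdot done^1_P \cdot (q^2_P \cdot false^2_P)^*$ and $Y' = run^1_P \cdot done^1_P \cdot (q^2_P \cdot true^2_P)^*$.
A similar calculation on the right hand side leads to the same result.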

6 Decidability and Complexity Issues 

As we have seen, regular languages provide a semantics for the fragment of IA
described here. To manipulate regular languages we have introduced a formal 
meta-language of extended regular expressions, which preserves regularity of the 
language. All the operations we have used in formulating the semantic valuations 
have been effectively given. Therefore, we can formulate the following obvious 
result : 

Theorem 3 (Decidability). Equivalence of two terms of the recursion free
second order finitary fragment of IA is decidable.
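
The decision procedure behind such a result ultimately compares two regular languages. A minimal sketch of that last step is given below, assuming each term has already been translated to a deterministic finite automaton; the translation from extended regular expressions is not shown, and the representation (a triple of start state, accepting set and transition dictionary) and the name `dfa_equivalent` are choices made here, not the paper's.

```python
from collections import deque


def dfa_equivalent(dfa1, dfa2, alphabet):
    """Decide language equality by exploring the product automaton.

    A DFA is (start, accepting_states, delta) with delta mapping
    (state, symbol) to a state. Missing transitions lead to an implicit
    rejecting sink, represented by None, so None must not be used as a state.
    """
    start1, accepting1, delta1 = dfa1
    start2, accepting2, delta2 = dfa2
    seen = {(start1, start2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if (p in accepting1) != (q in accepting2):
            return False  # some word is accepted by exactly one automaton
        for a in alphabet:
            pair = (delta1.get((p, a)), delta2.get((q, a)))
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True


ALPHABET = ["run", "done"]
skip = (0, {2}, {(0, "run"): 1, (1, "done"): 2})                     # run . done
skip_again = ("a", {"c"}, {("a", "run"): "b", ("b", "done"): "c"})   # same language
omega = (0, set(), {})                                               # empty language
```

The search visits at most the product of the two state counts, so the cost of the comparison itself is modest; the expense lies in building the automata, since intersection and determinisation can blow up their size.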

For the general problem of term equivalence the complexity bound appears to
be at least of exponential space, as is the case for regular expressions with in-
tersection m. However, the complexity bound for the general problem may not
be relevant for the kind of terms that arise in the model of IA, and particularly
for those that would be checked for equivalence in practice. This point, which
will be investigated in future work, is of the utmost importance if a tool is to be
developed based on our ideas.

Abstract. We introduce the measurement idea in domain theory and
then apply it to establish two fixed point theorems. The first is an ex- 
tension of the Scott fixed point theorem which applies to nonmonotonic 
mappings. The second is a contraction principle for monotone maps that 
guarantees the existence of unique fixed points. 



1 Introduction 

A measurement on a domain is a Scott continuous map $\mu : D \to [0,\infty)^*$
into the nonnegative reals in their reverse order which formalizes the notion of
information content for objects in a domain. Intuitively, if $x \in D$ is an informa-
tive object, then $\mu x$ is the amount of information it contains. In another light,
we may think of $\mu$ as measuring the disorder in an object, or entropy, since
$x \sqsubseteq y \Rightarrow \mu x \ge \mu y$, that is, the more informative an object is, the smaller its
measure.

After giving a precise definition of measurement and several natural exam- 
ples, we show the value of the idea by proving and then applying two fixed point 
theorems. The first is an extension of the Scott fixed point theorem which applies 
to nonmonotonic processes, like the bisection method and the r-section search. 
The second is a contraction principle for monotone maps that guarantees the ex- 
istence of unique fixed points, as opposed to the least fixed points that domain 
theory usually provides. 
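
The bisection method mentioned above can be sketched as follows. The sketch assumes intervals are pairs of floats and that the function changes sign on the starting interval; `bisection_split` and `iterate` are names introduced here for illustration. Each step returns a sub-interval of its argument, so the iterates become more informative under reverse inclusion while their length halves, even though the step itself need not be monotone.

```python
def bisection_split(f):
    """Return the bisection step for f as a map from intervals to sub-intervals."""
    def split(interval):
        a, b = interval
        m = (a + b) / 2
        # Keep the half on which f still changes sign (or touches zero).
        return (a, m) if f(a) * f(m) <= 0 else (m, b)
    return split


def iterate(split, interval, steps):
    """Collect the interval together with its first `steps` iterates under `split`."""
    history = [interval]
    for _ in range(steps):
        history.append(split(history[-1]))
    return history


# Locate the square root of 2 as a zero of x^2 - 2 on [1, 2].
history = iterate(bisection_split(lambda x: x * x - 2), (1.0, 2.0), 20)
lo, hi = history[-1]
```

Every interval in `history` contains the next one, and the length after n steps is the starting length divided by 2^n, which is the sense in which the process converges to a maximal element.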



2 Background 

2.1 Domain Theory 

A poset is a partially ordered set.

Definition 1. A least element in a poset $(P, \sqsubseteq)$ is an element $\bot \in P$ such that
$\bot \sqsubseteq x$ for all $x \in P$. Such an element is unique. An element $x \in P$ is maximal
if $(\forall y \in P)\ x \sqsubseteq y \Rightarrow x = y$. The set of maximal elements in a poset is written
max P.

Definition 2. Let $(P, \sqsubseteq)$ be a poset. A nonempty subset $S \subseteq P$ is directed if
$(\forall x, y \in S)(\exists z \in S)\ x, y \sqsubseteq z$. The supremum of a subset $S \subseteq P$ is the least of
all its upper bounds provided it exists. This is written $\bigsqcup S$. A dcpo is a poset in
which every directed subset has a supremum.

Definition 3. In a poset $(P, \sqsubseteq)$, $a \ll x$ iff for all directed subsets $S \subseteq P$ which
have a supremum, $x \sqsubseteq \bigsqcup S \Rightarrow (\exists s \in S)\ a \sqsubseteq s$. We set $\downarrow\!\!\downarrow x = \{a \in P : a \ll x\}$. An
element $x \in P$ is compact if $x \ll x$. The set of compact elements in P is K(P).

Definition 4. A subset B of a poset P is a basis for P if $B \cap \downarrow\!\!\downarrow x$ contains a
directed subset with supremum x, for each $x \in P$.

Definition 5. A poset is continuous if it has a basis. A poset is algebraic if its
compact elements form a basis. A poset is $\omega$-continuous if it has a countable
basis.

Definition 6. A domain is a continuous dcpo.

Definition 7. A subset U of a poset P is Scott open if

(i) U is an upper set: $x \in U\ \&\ x \sqsubseteq y \Rightarrow y \in U$, and

(ii) U is inaccessible by directed suprema: For every directed $S \subseteq P$ with a
supremum,

\[ \bigsqcup S \in U \Rightarrow S \cap U \ne \emptyset. \]

The collection of all Scott open subsets of P is called the Scott topology. It is
denoted $\sigma_P$.

Unless explicitly stated otherwise, all topological statements about posets are
made with respect to the Scott topology.

Proposition 1. A function $f : D \to E$ between dcpos is continuous iff

(i) f is monotone: $x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$.

(ii) f preserves directed suprema: For all directed $S \subseteq D$, $f(\bigsqcup S) = \bigsqcup f(S)$.

2.2 Examples of Domains 

Example 1. The interval domain is the collection of compact intervals of the real
line

\[ \mathbf{I}\mathbb{R} = \{ [a, b] : a, b \in \mathbb{R}\ \&\ a \le b \} \]

ordered under reverse inclusion

\[ [a, b] \sqsubseteq [c, d] \iff [c, d] \subseteq [a, b]. \]

It is an $\omega$-continuous dcpo. The supremum of a directed set $S \subseteq \mathbf{I}\mathbb{R}$ is $\bigcap S$, while
the approximation relation is characterized by $I \ll J \iff J \subseteq \mathrm{int}(I)$. A countable
basis for $\mathbf{I}\mathbb{R}$ is given by $\{[p, q] : p, q \in \mathbb{Q}\ \&\ p \le q\}$.
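
The order, the approximation relation and suprema of the interval domain translate directly into code. This is a minimal sketch, assuming floating-point endpoints stand in for real numbers and only finite families are considered; the class `Interval` and the function names are introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("an interval needs lo <= hi")


def below(i, j):
    """i is below j in the information order: j is contained in i."""
    return i.lo <= j.lo and j.hi <= i.hi


def way_below(i, j):
    """i approximates j: j lies in the interior of i."""
    return i.lo < j.lo and j.hi < i.hi


def supremum(intervals):
    """Supremum of a finite directed family: the intersection of its members."""
    intervals = list(intervals)
    return Interval(max(i.lo for i in intervals), min(i.hi for i in intervals))


def length(i):
    """Length of an interval; smaller means more informative."""
    return i.hi - i.lo


chain = [Interval(0.0, 1.0), Interval(0.0, 0.5), Interval(0.25, 0.5)]
top = supremum(chain)
```

Note that `Interval(0.0, 1.0)` is below `Interval(0.0, 0.5)` without approximating it, because the two share an endpoint; the constructor raises if a family that is not directed produces an empty intersection.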







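Definition 8. A partial function $f : X \rightharpoonup Y$ between sets X and Y is a function
$f : A \to Y$ defined on a subset $A \subseteq X$. We write dom(f) = A for the domain of
a partial map $f : X \rightharpoonup Y$.

Example 2. The set of partial mappings on the naturals

\[ [\mathbb{N} \rightharpoonup \mathbb{N}] = \{ f \mid f : \mathbb{N} \rightharpoonup \mathbb{N} \} \]

becomes an $\omega$-algebraic dcpo when ordered by extension

\[ f \sqsubseteq g \iff \mathrm{dom}(f) \subseteq \mathrm{dom}(g)\ \&\ f = g \text{ on } \mathrm{dom}(f). \]

The supremum of a directed set $S \subseteq [\mathbb{N} \rightharpoonup \mathbb{N}]$ is $\bigcup S$, under the view that
functions are certain subsets of $\mathbb{N} \times \mathbb{N}$, while the approximation relation is

\[ f \ll g \iff f \sqsubseteq g\ \&\ \mathrm{dom}(f) \text{ is finite}. \]

The maximal elements of $[\mathbb{N} \rightharpoonup \mathbb{N}]$ are the total functions, that is, those functions
f with dom(f) = $\mathbb{N}$.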



Example 3. The Cantor set model is the collection of functions

\[ \Sigma^\infty = \{ s \mid s : \{1, \ldots, n\} \to \{0, 1\},\ 0 \le n \le \infty \}. \]

It is also an $\omega$-algebraic dcpo under the extension order

\[ s \sqsubseteq t \iff |s| \le |t|\ \&\ (\forall\, 1 \le i \le |s|)\ s(i) = t(i), \]

where |s| is written for the cardinality of dom(s). The supremum of a directed
set $S \subseteq \Sigma^\infty$ is $\bigcup S$, while the approximation relation is

\[ s \ll t \iff s \sqsubseteq t\ \&\ |s| < \infty. \]

The extension order in this special case is usually called the prefix order. The
elements $s \in \Sigma^\infty$ are called strings over {0, 1}. The quantity |s| is called the
length of a string s. The empty string $\epsilon$ is the unique string with length zero. It
is the least element $\bot$ of $\Sigma^\infty$.
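
For finite strings the prefix order is easy to state in code. The sketch below assumes strings are Python tuples of bits, so infinite strings are out of scope; the function `measure` anticipates the quantity $1/2^{|s|}$ that the text later attaches to this model, and both function names are introduced here for illustration.

```python
def is_prefix(s, t):
    """s is below t in the prefix order: t extends s."""
    return len(s) <= len(t) and t[:len(s)] == s


def measure(s):
    """1 / 2^|s| for a finite string s; longer strings have smaller measure."""
    return 2.0 ** -len(s)


empty = ()
s, t = (0, 1), (0, 1, 1)
```

The empty tuple is a prefix of every string, matching its role as least element, and extending a string can only lower its measure.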



Example 4. If X is a locally compact Hausdorff space, then its upper space

\[ \mathbf{U}X = \{ \emptyset \ne K \subseteq X : K \text{ is compact} \} \]

ordered under reverse inclusion

\[ A \sqsubseteq B \iff B \subseteq A \]

is a continuous dcpo. The supremum of a directed set $S \subseteq \mathbf{U}X$ is $\bigcap S$ and the
approximation relation is $A \ll B \iff B \subseteq \mathrm{int}(A)$.

Example 5. Given a metric space (X, d), the formal ball model [2]

\[ \mathbf{B}X = X \times [0, \infty) \]

is a poset when ordered via

\[ (x, r) \sqsubseteq (y, s) \iff d(x, y) \le r - s. \]

The approximation relation is characterized by

\[ (x, r) \ll (y, s) \iff d(x, y) < r - s. \]

The poset $\mathbf{B}X$ is continuous. However, $\mathbf{B}X$ is a dcpo iff the metric d is complete.
In addition, $\mathbf{B}X$ has a countable basis iff X is a separable metric space.

3 Measurement 

The set $[0, \infty)^*$ is the domain of nonnegative reals in their opposite order.

Definition 9. A Scott continuous map $\mu : D \to [0, \infty)^*$ on a continuous dcpo
D induces the Scott topology near $X \subseteq D$ if for all Scott open sets $U \subseteq D$ and
for any $x \in X$,

\[ x \in U \Rightarrow (\exists \varepsilon > 0)\ x \in \mu_\varepsilon(x) \subseteq U, \]

where $\mu_\varepsilon(x) = \{ y \in D : y \sqsubseteq x\ \&\ |\mu x - \mu y| < \varepsilon \}$. This is written $\mu \to \sigma_X$.

Definition 10. A measurement on a domain D is a Scott continuous mapping
$\mu : D \to [0, \infty)^*$ with $\mu \to \sigma_{\ker \mu}$ where $\ker \mu = \{ x \in D : \mu x = 0 \}$.

The most useful properties of measurements in applications are as follows.

Proposition 2. If D is a domain with a measurement $\mu \to \sigma_X$, then

(i) For all $x \in D$ and $y \in X$, $x \sqsubseteq y\ \&\ \mu x = \mu y \Rightarrow x = y$.

(ii) For all $x \in D$, $\mu x = 0 \Rightarrow x \in \max D$.

(iii) For all $x \in X$ and any sequence $(x_n)$ in D with $x_n \sqsubseteq x$, if $\mu x_n \to \mu x$, then
$\bigsqcup x_n = x$, and this supremum converges in the Scott topology.

Proof. For (i), we prove that $y \sqsubseteq x$. Let U be a Scott open set around y. Then
there is $\varepsilon > 0$ with $y \in \mu_\varepsilon(y) \subseteq U$. But $x \sqsubseteq y$ and $\mu x = \mu y$ hence $x \in \mu_\varepsilon(y) \subseteq U$.
Thus, every Scott open set around y also contains x, establishing $y \sqsubseteq x$. (ii)
follows from (i). The proof of (iii) uses the same technique applied in (i) and
may be found in [3]. □

Prop. 2 shows that measurements capture the essential characteristics of
information content. For example, (i) says that comparable objects with the
same amount of information are equal, (ii) says that an element with no disorder
in it (no partiality) must be maximal in the information order, and (iii) says that
if we measure an iterative process $(x_n)$ as computing an object x, then it actually
does calculate x. The common theme in each case is this: Any observation made
with a measurement is a reliable one.

Example 6. Domains and their standard measurements.

(i) $(\mathbf{I}\mathbb{R}, \mu)$ the interval domain with the length measurement $\mu[a, b] = b - a$.

(ii) $([\mathbb{N} \rightharpoonup \mathbb{N}], \mu)$ the partial functions on the naturals with

\[ \mu f = |\mathrm{dom}(f)| \]

where $|\cdot| : \mathcal{P}\omega \to [0, \infty)^*$ is the measurement on the algebraic lattice $\mathcal{P}\omega$.

(iii) $(\Sigma^\infty, 1/2^{|\cdot|})$ the Cantor set model where $|\cdot| : \Sigma^\infty \to [0, \infty]$ is the length of
a string.

(iv) $(\mathbf{U}X, \mathrm{diam})$ the upper space of a locally compact metric space (X, d) with

\[ \mathrm{diam}\, K = \sup\{ d(x, y) : x, y \in K \}. \]

(v) $(\mathbf{B}X, \pi)$ the formal ball model of a complete metric space (X, d) with $\pi(x, r) = r$.

In each example above, we have a measurement $\mu : D \to [0, \infty)^*$ on a domain
with $\ker \mu = \max D$. In all cases except (iv), we also have $\mu \to \sigma_D$. In general,
there are existence theorems 0 for countably based domains, which show that
measurements usually exist. However, the value of the idea lies not in knowing
that they exist abstractly, but in knowing that particular mappings, like the ones
in the last example, are measurements.

4 Fixed Points of Nonmonotonic Maps

Definition 11. A splitting on a poset P is a function $s : P \to P$ with $x \sqsubseteq s(x)$
for all $x \in P$.

Proposition 3. Let D be a domain with a measurement $\mu \to \sigma_D$. If $I \subseteq D$ is
closed under directed suprema and $s : I \to I$ is a splitting whose measure

\[ \mu \circ s : I \to [0, \infty)^* \]

is Scott continuous between dcpo's, then

\[ (\forall x \in I) \quad \bigsqcup_{n \ge 0} s^n(x) \in \mathrm{fix}(s). \]

Moreover, the set of fixed points $\mathrm{fix}(s) = \{ x \in I : s(x) = x \}$ is a dcpo.

Proof. Let $x \in I$. By induction, $(s^n(x))$ is an increasing sequence in I. The set I
is closed under directed suprema hence $\bigsqcup_{n \ge 0} s^n(x) \in I$. Because s is a splitting,
$\bigsqcup_{n \ge 0} s^n(x) \sqsubseteq s(\bigsqcup_{n \ge 0} s^n(x))$, while the fact that $\mu \circ s$ and $\mu$ are both Scott
continuous allows us to compute

\[ \mu s\Big( \bigsqcup_{n \ge 0} s^n(x) \Big) = \lim_{n \to \infty} \mu s^{n+1}(x) = \mu\Big( \bigsqcup_{n \ge 0} s^n(x) \Big). \]

By Prop. 2 however, two comparable elements whose measures agree must in
fact be equal. Hence,

\[ s\Big( \bigsqcup_{n \ge 0} s^n(x) \Big) = \bigsqcup_{n \ge 0} s^n(x). \]

To show that fix(s) is a dcpo one need only prove closure under suprema of 
sequences because /i — >■ an P|. The proof for sequences, however, uses the very 
same methods employed above and is entirely trivial. □ 

Example 7. Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous map on the real line.
Denote by $C(f)$ the subset of $\mathbf{I}\mathbb{R}$ where $f$ changes sign, that is,

\[ C(f) = \{[a,b] : f(a) \cdot f(b) \le 0\}. \]

The continuity of $f$ ensures that this set is closed under directed suprema, and
the mapping

\[ \mathrm{split}_f : C(f) \to C(f) \]

given by

\[ \mathrm{split}_f[a,b] = \begin{cases}
     \mathrm{left}[a,b] & \text{if } \mathrm{left}[a,b] \in C(f); \\
     \mathrm{right}[a,b] & \text{otherwise,}
   \end{cases} \]

is a splitting where $\mathrm{left}[a,b] = [a, (a+b)/2]$ and
$\mathrm{right}[a,b] = [(a+b)/2, b]$. The measure of this mapping

\[ \mu\,\mathrm{split}_f[a,b] = \frac{\mu[a,b]}{2} \]

is Scott continuous, so Proposition 3 implies that

\[ \bigsqcup_{n \ge 0} \mathrm{split}_f^n[a,b] \in \mathrm{fix}(\mathrm{split}_f). \]

However, $\mathrm{fix}(\mathrm{split}_f) = \{[r] : f(r) = 0\}$, which means that
iterating $\mathrm{split}_f$ is a scheme for calculating a solution of the equation
$f(x) = 0$. This numerical technique is called the bisection method.
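
The iteration of split_f in Example 7 can be sketched in Python as follows. This
is a minimal illustration rather than code from the paper: intervals are modeled
as pairs of floats, and the names `changes_sign`, `split`, `below` and `bisect`,
as well as the tolerance-based stopping rule, are assumptions introduced here.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]


def changes_sign(f: Callable[[float], float], iv: Interval) -> bool:
    """Membership test for C(f): f(a) * f(b) <= 0."""
    a, b = iv
    return f(a) * f(b) <= 0.0


def split(f: Callable[[float], float], iv: Interval) -> Interval:
    """One step of split_f: keep the left half if f changes sign there,
    otherwise keep the right half."""
    a, b = iv
    mid = (a + b) / 2.0
    left, right = (a, mid), (mid, b)
    return left if changes_sign(f, left) else right


def below(x: Interval, y: Interval) -> bool:
    """Information order on intervals: x is below y iff y is contained in x."""
    return x[0] <= y[0] and y[1] <= x[1]


def bisect(f: Callable[[float], float], iv: Interval,
           tol: float = 1e-12, max_iter: int = 200) -> Interval:
    """Iterate split_f until the length measurement drops to tol (assumed
    stopping rule; the paper takes the supremum of the whole chain)."""
    if not changes_sign(f, iv):
        raise ValueError("f must change sign on the starting interval")
    for _ in range(max_iter):
        if iv[1] - iv[0] <= tol:
            break
        iv = split(f, iv)
    return iv
```

For instance, `bisect(lambda x: x * x - 2.0, (0.0, 2.0))` returns a narrow
interval around the square root of 2. The helper `below` encodes the order on
intervals, which makes it easy to test on concrete intervals whether `split`
preserves that order.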

The major fixed point technique in classical domain theory, the Scott fixed point
theorem, cannot be used to establish the correctness of the bisection method:
$\mathrm{split}_f$ is only monotone in computationally irrelevant cases.

Proposition 4. For a continuous selfmap $f : \mathbb{R} \to \mathbb{R}$ which has at
least one zero, the following are equivalent:

(i) The map $\mathrm{split}_f$ is monotone.

(ii) The map $f$ has a unique zero $r$ and

\[ C(f) = \{[a,r] : a \le r\} \cup \{[r,b] : r \le b\}. \]

Proof We prove (i) $\Rightarrow$ (ii). Let $\alpha < \beta$ be two distinct roots of
$f$. Then by monotonicity of $\mathrm{split}_f$,

\[ \mathrm{split}_f^n[\alpha,\beta] \sqsubseteq \mathrm{split}_f^n[\beta] = [\beta], \]

for all $n \ge 0$. Then
$[\alpha] = \bigsqcup \mathrm{split}_f^n[\alpha,\beta] \sqsubseteq [\beta]$, which
proves $\alpha = \beta$. Thus, $f$ has a unique zero $r$.

Now let $[a,b] \in C(f)$ with $a < r < b$ and set
$\delta = \max\{r - a, b - r\} > 0$. Then $r - \delta \le a < b \le r + \delta$.
By the uniqueness of $r$,

\[ f(r - \delta) \cdot f(a) > 0 \quad\text{and}\quad f(b) \cdot f(r + \delta) > 0, \]

and since $[a,b] \in C(f)$, we have $y := [r - \delta, r + \delta] \in C(f)$. For the
very same reason, $x := [r - \delta - \delta/2, r + \delta + \delta/4] \in C(f)$. But
then we have $x \sqsubseteq y$ and

\[ \mathrm{split}_f\,x = [r - \delta/8, r + \delta + \delta/4]
   \not\sqsubseteq [r - \delta, r] = \mathrm{split}_f\,y, \]

which means $\mathrm{split}_f$ is not monotone if $f$ changes sign on an interval which
contains $r$ in its interior. □

That is, if $\mathrm{split}_f$ is monotone, then in order to calculate the solution $r$
of $f(x) = 0$ using the bisection method, we must first know the solution $r$.

Example 8. A function $f : [a,b] \to \mathbb{R}$ is unimodal if it has a maximum value
assumed at a unique point $x^* \in [a,b]$ such that

(i) $f$ is strictly increasing on $[a,x^*]$, and

(ii) $f$ is strictly decreasing on $[x^*,b]$.

Unimodal functions have the important property that

\[ x_1 < x_2 \Rightarrow \begin{cases}
     x_1 \le x^* \le b & \text{if } f(x_1) < f(x_2), \\
     a \le x^* \le x_2 & \text{otherwise.}
   \end{cases} \]

This observation leads to an algorithm for computing $x^*$. For a unimodal map
$f : [a,b] \to \mathbb{R}$ with maximizer $x^* \in [a,b]$ and a constant
$1/2 < r < 1$, define a dcpo by

\[ I_{x^*} = \{x \in \mathbf{I}\mathbb{R} : [a,b] \sqsubseteq x \sqsubseteq [x^*]\}, \]

and a splitting by

\[ \max\nolimits_f : I_{x^*} \to I_{x^*}, \qquad
   \max\nolimits_f[a,b] = \begin{cases}
     [l(a,b), b] & \text{if } f(l(a,b)) < f(r(a,b)); \\
     [a, r(a,b)] & \text{otherwise,}
   \end{cases} \]

where $l(a,b) = (b-a)(1-r) + a$ and $r(a,b) = (b-a)r + a$. The measure of
$\max_f$ is Scott continuous since $\mu\max_f(x) = r \cdot \mu(x)$, for all
$x \in I_{x^*}$. By Proposition 3,

\[ \bigsqcup_{n \ge 0} \max\nolimits_f^n(x) \in \mathrm{fix}(\max\nolimits_f), \]

for any $x \in I_{x^*}$. However, any fixed point of $\max_f$ has measure zero, and
the only element of $I_{x^*}$ with measure zero is $[x^*]$. Thus,
$\bigsqcup \max_f^n[a,b] = [x^*]$, which means that iterating $\max_f$ yields a
method for calculating $x^*$. This technique is called the r-section search.

Finally, observe that $\max_f$ is not monotone. Let $-1 < a < 1$ and
$f(x) = 1 - x^2$. The function $f$ is unimodal on any compact interval. Since
$\max_f[-1,1] = [-1, 2r-1]$, we see that

\begin{align*}
\max\nolimits_f[-1,1] \sqsubseteq \max\nolimits_f[a,1]
  &\Rightarrow 1 \le 2r - 1 \text{ or } r(a,1) \le 2r - 1 \\
  &\Rightarrow 1 \le r \text{ or } a + 1 \le r(a + 1) \\
  &\Rightarrow r \ge 1,
\end{align*}

which contradicts $r < 1$. Thus, for no value of $r$ is the algorithm monotone.
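
The r-section search of Example 8 can be sketched in Python as follows. This is
an illustrative sketch under stated assumptions, not the paper's code: intervals
are pairs of floats, and the names `r_section_step` and `r_section_search`, the
default value of `r`, and the tolerance-based stopping rule are all introduced
here for illustration.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]


def r_section_step(f: Callable[[float], float], iv: Interval, r: float) -> Interval:
    """One application of max_f to [a, b] for a constant 1/2 < r < 1."""
    a, b = iv
    lower = (b - a) * (1.0 - r) + a   # l(a, b)
    upper = (b - a) * r + a           # r(a, b)
    # If f(l) < f(r), the maximizer lies in [l, b]; otherwise in [a, r].
    return (lower, b) if f(lower) < f(upper) else (a, upper)


def r_section_search(f: Callable[[float], float], iv: Interval,
                     r: float = 0.618, tol: float = 1e-6,
                     max_iter: int = 10_000) -> Interval:
    """Iterate max_f until the interval length drops to tol (assumed
    stopping rule). Each step scales the length by the factor r."""
    if not 0.5 < r < 1.0:
        raise ValueError("r must satisfy 1/2 < r < 1")
    for _ in range(max_iter):
        if iv[1] - iv[0] <= tol:
            break
        iv = r_section_step(f, iv, r)
    return iv
```

With $f(x) = 1 - x^2$ and $r = 0.75$, a single step maps $[-1, 1]$ to
$[-1, 0.5]$ but maps the smaller interval $[0, 1]$ to $[0, 0.75]$, which is not
contained in $[-1, 0.5]$. This is a concrete instance of the failure of
monotonicity derived above.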

As further evidence of its applicability, notice that Prop. 3 also implies the Scott
fixed point theorem for domains with measurements $\mu \to \sigma_D$.

Example 9. If $f : D \to D$ is a Scott continuous map on a domain $D$ with a
measurement $\mu \to \sigma_D$, then we consider its restriction to the set of points
where it improves

\[ I(f) = \{x \in D : x \sqsubseteq f(x)\}. \]

This evidently yields a splitting $f : I(f) \to I(f)$ on a dcpo with continuous
measure. By Proposition 3,

\[ (\forall x \in I(f))\ \bigsqcup_{n \ge 0} f^n(x) \text{ is a fixed point of } f. \]

For instance, if $D$ is $\omega$-continuous with basis $\{b_n : n \in \mathbb{N}\}$, then

\[ \mu x = |\{n : b_n \ll x\}| \]

defines a measurement $\mu \to \sigma_D$. Notice, however, that with this construction
we normally have $\ker\mu = \emptyset$.

5 Unique Fixed Points of Monotonic Maps 

In the last section, we saw that measurement can be used to generalize the Scott
fixed point theorem so as to include important nonmonotonic processes. Now we
see that it can improve upon it for monotone maps as well, by giving a technique
that guarantees unique fixed points.

Definition 12. Let $D$ be a continuous dcpo with a measurement $\mu$. A monotone
map $f : D \to D$ is a contraction if there is a constant $c < 1$ with

\[ \mu f(x) \le c \cdot \mu x \]

for all $x \in D$.

Theorem 1. Let $D$ be a domain with a measurement $\mu$ such that

\[ (\forall x, y \in \ker\mu)(\exists z \in D)\ z \sqsubseteq x, y. \]

If $f : D \to D$ is a contraction and there is a point $x \in D$ with
$x \sqsubseteq f(x)$, then

\[ x^* = \bigsqcup_{n \ge 0} f^n(x) \in \max D \]

is the unique fixed point of $f$ on $D$. Furthermore, $x^*$ is an attractor in two
different senses:

(i) For all $x \in \ker\mu$, $f^n(x) \to x^*$ in the Scott topology on $\ker\mu$, and

(ii) For all $x \sqsubseteq x^*$, $\bigsqcup_{n \ge 0} f^n(x) = x^*$, and this supremum
is a limit in the Scott topology on $D$.

Proof First, for any $x \in D$ and any $n \ge 0$, $\mu f^n(x) \le c^n \mu x$, as is
easy to prove by induction. Given a point $x \sqsubseteq f(x)$, the monotonicity of
$f$ implies the sequence $(f^n(x))$ is increasing, while the continuity of $\mu$
allows us to compute

\[ \mu\Bigl(\bigsqcup f^n(x)\Bigr) = \lim \mu f^n(x) \le \lim c^n \mu x = 0. \]

Hence, $x^* = \bigsqcup f^n(x) \in \ker\mu \subseteq \max D$. But the monotonicity of
$f$ also gives $x^* \sqsubseteq f(x^*)$. Hence, $x^* = f(x^*)$ is a fixed point of $f$.
We will prove its uniqueness after (ii).

For (ii), let $x \sqsubseteq x^*$. By the monotonicity of $f$,

\[ (\forall n \ge 0)\ f^n(x) \sqsubseteq f^n(x^*) = x^*, \]

and since $\lim \mu f^n(x) = \mu x^* = 0$, the fact that $\mu$ is a measurement yields

\[ \bigsqcup_{n \ge 0} f^n(x) = x^*. \]

Now let $x_*$ be any fixed point of $f$. Then $x_* \in \ker\mu$, so there is $z \in D$
with $z \sqsubseteq x_*, x^*$. By (ii), $\bigsqcup f^n(z) = x^* = x_*$. Thus, the fixed
point $x^*$ is unique.

For (i), let $x \in \ker\mu$. Then $f^n(x) \in \ker\mu$ for all $n \ge 0$. In addition,
there is an $a \sqsubseteq x, x^*$, so $f^n(a) \sqsubseteq f^n(x), x^*$. Now let $U$ be
a Scott open set around $x^*$. Because $\mu$ is a measurement,

\[ (\exists \varepsilon > 0)\ x^* \in \mu_\varepsilon(x^*) \subseteq U. \]

Since $\mu f^n(a) \to 0$, all but a finite number of the $f^n(a)$ are in $U$. But $U$
is an upper set, so the same is true of the $f^n(x)$. Hence, $f^n(x) \to x^*$ in the
Scott topology on $\ker\mu$. □
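
The induction behind the first step of the proof can be spelled out as follows.
This is a short worked derivation added for illustration; it assumes
$0 \le c < 1$, which is what the second inequality in the inductive step needs.

```latex
\begin{align*}
\text{Base case } (n = 0):\quad
  & \mu f^{0}(x) = \mu x = c^{0}\,\mu x. \\
\text{Inductive step } (n \to n + 1):\quad
  & \mu f^{n+1}(x) = \mu f\bigl(f^{n}(x)\bigr)
    \le c \cdot \mu f^{n}(x)
    \le c \cdot c^{n}\,\mu x
    = c^{n+1}\,\mu x.
\end{align*}
```

The first inequality in the inductive step is the contraction property applied to
the point $f^n(x)$, and the second is the induction hypothesis multiplied by $c$.
Since $c^n \to 0$, the bound forces $\mu f^n(x) \to 0$.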

When a domain has a least element, the last result is easier to state.

Corollary 1. Let $D$ be a domain with least element $\bot$ and measurement $\mu$. If
$f : D \to D$ is a contraction, then

\[ x^* = \bigsqcup_{n \ge 0} f^n(\bot) \in \max D \]

is the unique fixed point of $f$ on $D$. In addition, the other conclusions of
Theorem 1 hold as well.

All the domains that we have considered in this paper have the property that
$(\forall x, y \in D)(\exists z \in D)\ z \sqsubseteq x, y$, and so Theorem 1 can be
applied to them.

Example 10. Let $f : X \to X$ be a contraction on a complete metric space $X$
with Lipschitz constant $c < 1$. The mapping $f : X \to X$ extends to a monotone
map on the formal ball model $\bar{f} : \mathbf{B}X \to \mathbf{B}X$ given by

\[ \bar{f}(x,r) = (fx, c \cdot r), \]

which satisfies

\[ \pi\bar{f}(x,r) = c \cdot \pi(x,r), \]

where $\pi : \mathbf{B}X \to [0,\infty)^*$ is the standard measurement on
$\mathbf{B}X$, $\pi(x,r) = r$. Now choose $r$ so that
$(x,r) \sqsubseteq \bar{f}(x,r)$. By Theorem 1, $\bar{f}$ has a unique attractor,
which implies that $f$ does also because $X \simeq \ker\pi$.
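
The construction of Example 10 can be sketched in Python for the real line with
the usual distance. This is an illustration under stated assumptions: the order
on formal balls is taken to be the usual one, $(x,r) \sqsubseteq (y,s)$ iff
$d(x,y) \le r - s$ (its definition lies outside this excerpt), and the names
`below`, `extend`, `initial_ball` and `iterate` are introduced here. The starting
radius $d(x, fx)/(1 - c)$ is one choice that makes the first ball lie below its
image.

```python
from typing import Callable, Tuple

Ball = Tuple[float, float]  # (center, radius) with radius >= 0


def dist(x: float, y: float) -> float:
    """Metric on the real line."""
    return abs(x - y)


def below(b1: Ball, b2: Ball) -> bool:
    """Assumed order on formal balls: (x, r) below (y, s) iff d(x, y) <= r - s."""
    (x, r), (y, s) = b1, b2
    return dist(x, y) <= r - s


def extend(f: Callable[[float], float], c: float) -> Callable[[Ball], Ball]:
    """Extension of f to formal balls: (x, r) is sent to (f(x), c * r)."""
    def f_bar(ball: Ball) -> Ball:
        x, r = ball
        return (f(x), c * r)
    return f_bar


def initial_ball(f: Callable[[float], float], c: float, x0: float) -> Ball:
    """Choose r with (x0, r) below f_bar(x0, r): r = d(x0, f(x0)) / (1 - c)."""
    return (x0, dist(x0, f(x0)) / (1.0 - c))


def iterate(f: Callable[[float], float], c: float, x0: float,
            eps: float = 1e-10, max_iter: int = 10_000) -> Ball:
    """Iterate f_bar until the measurement pi(x, r) = r drops to eps."""
    f_bar = extend(f, c)
    ball = initial_ball(f, c, x0)
    for _ in range(max_iter):
        if ball[1] <= eps:
            break
        ball = f_bar(ball)
    return ball
```

For the map `lambda x: x / 2 + 1` with `c = 0.5` and `x0 = 0.0`, the iteration
starts at the ball `(0.0, 2.0)` and the centers approach the fixed point 2. Each
ball in the chain lies below the limit ball, so the radius bounds the distance
from the current center to the fixed point.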



We can also use the upper space (UX,diam) to prove the Banach contraction 
theorem for compact metric spaces by applying the technique of the last example. 



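Example 11. Consider the well-known functional

\[ \phi(f)(k) = \begin{cases}
     1 & \text{if } k = 0, \\
     k\,f(k-1) & \text{if } k \ge 1 \ \&\ k - 1 \in \mathrm{dom}\,f,
   \end{cases} \]

which is easily seen to be monotone. Applying
$\mu : [\mathbb{N} \rightharpoonup \mathbb{N}] \to [0,\infty)^*$, we compute

\[ \mu\phi(f) = |\mathrm{dom}(\phi(f))| = \cdots \]

[remainder of the computation lost in extraction], which means $\phi$ is a contraction
on the domain $[\mathbb{N} \rightharpoonup \mathbb{N}]$. By the contraction principle,

\[ \bigsqcup_{n \ge 0} \phi^n(\bot) = \mathrm{fac} \]

is the unique fixed point of $\phi$ on $[\mathbb{N} \rightharpoonup \mathbb{N}]$,
where $\bot$ is the function defined nowhere.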
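
The chain of approximations in Example 11 can be sketched in Python by modeling
finite partial functions on the naturals as dictionaries. This is an
illustration only; the dictionary representation and the names `phi`, `below`
and `approximations` are assumptions introduced here.

```python
from typing import Dict, List

PartialFn = Dict[int, int]  # a finite partial function on the naturals


def phi(f: PartialFn) -> PartialFn:
    """The functional of Example 11: phi(f)(0) = 1, and
    phi(f)(k) = k * f(k - 1) whenever k >= 1 and k - 1 is in dom f."""
    g = {0: 1}
    for j, value in f.items():
        g[j + 1] = (j + 1) * value
    return g


def below(f: PartialFn, g: PartialFn) -> bool:
    """Extension order: g is defined wherever f is, with the same values."""
    return all(k in g and g[k] == v for k, v in f.items())


def approximations(n: int) -> List[PartialFn]:
    """The chain bottom, phi(bottom), ..., phi^n(bottom), where bottom is
    the function defined nowhere (the empty dictionary)."""
    f: PartialFn = {}
    chain = [f]
    for _ in range(n):
        f = phi(f)
        chain.append(f)
    return chain
```

After `n` steps the approximation is defined exactly on `0, ..., n - 1` and
agrees with the factorial there, so the supremum of the chain is the total
function `fac`.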

One wonders here about the potential for replacing metric space semantics with 
an approach based on measurement and contractions. 



6 Closing Remarks 

There are many ideas left from the present discussion on measurement. Among
the most fundamental are the $\mu$ topology, the study of the topological structure
of $\ker\mu$, a discussion of how one extends measurements to higher order domains,
and the informatic derivative (the derivative of a map on a domain with respect
to a measurement). All of this can be found on the author’s webpage in [?].

Abstract. Distributed software systems are typically built according
to a three layer conceptual structure: Objects on the lowest layer are 
clustered by components on the second layer, which themselves are lo- 
cated at nodes of a computer network on the third layer. Orthogonal 
to these three layers, an instance level and a type or schema level are 
distinguished when modeling these systems. Accordingly, the changes a 
system experiences during its lifetime can be classified as the system’s 
dynamic behavior on the instance level and as the evolution of the sys- 
tem on the schema level. This paper shows how concepts from the area of 
graph transformation can be applied to provide a conceptual and formal 
framework for describing the structural and behavioral aspects of such 
systems. 

Keywords: typed graph transformation, hierarchical graphs, system 
modeling, model evolution 



1 Introduction 

The structure and characteristics of software systems have changed dramatically 
during the last four decades. Starting with small programs in the 1960s and con- 
tinuing with large monolithic systems in the 1970s, the 1980s faced a development 
towards hierarchically structured subsystems or modules. Today, software sys- 
tems represent complex, often dynamic networks of interacting components or 
agents. In order to adapt to changing requirements and application contexts, and
supported by modern programming language features (as, e.g., Java RMI [?]),
middleware standards (like CORBA or DCOM [?]), and world-wide connectivity,
these components may evolve over time, and they may be down-loaded
and linked together while the system is executing. 

Beside the development from monolithic and static towards distributed and 
mobile applications, the process of software development has also changed. To- 
day’s software development teams consist of developers with different expertise, 
skills, and responsibilities. Due to resource management, organizational, or
administrative reasons, the development process itself may be distributed.
Moreover, since complex systems can no longer be built from scratch, an incremental
software development process is employed which is structured in several phases
like, for example, a requirements analysis, design, and implementation phase
(cf. [?]).

A major concern for the success of such a distributed, incremental process is 
the maintenance of consistency in different dimensions: for instance, between the 
behavior of different components of the system, between the concepts used by 
different development teams in the project, or between the artifacts of different 
phases of development. 

An important means for achieving consistency in all dimensions is to build 
a model. It can, for instance, facilitate formal verification of component inter- 
action, form the basis for communication between teams, and serve as project 
documentation where all relevant decisions are documented in order to trace 
design choices between different phases of development. In particular, a requirements
specification document is indispensable in which the functionality of the
software system to be built is stated. This document may represent a contract 
between the customer and the software development team as well as a contract 
within the team between software engineers being responsible for the analysis 
and design tasks and the programmers being responsible for a correct and effi- 
cient implementation. 

Thus, the model has to be written in a language which is intuitively and easily 
understandable by customers, in general not being computer scientists, as well 
as by software engineers and programmers, hopefully being computer scientists. 
Particularly, this means that the specification or modeling language used has to
provide language features which are on an appropriate abstraction level. Thus,
the language has to offer support to abstract from programming or even
machine-dependent details on the one hand and from unimportant real-world details
on the other hand (cf. Fig. 1).




Fig. 1. Role of a system model. 



Often, informal diagrammatic modeling languages like the Unified Modeling
Language (UML) are considered the first choice from this point of view:
They are visual and apparently easy to understand and, due to the use of various
diagrams and special-purpose notations for different aspects, they provide
good support for high-level specifications. However, as we can judge from our
own experience in teaching and modeling in UML, the language still lacks a clear
underlying conceptual model of the systems to be modeled [?]. But an agreement
on such a conceptual model has to precede the definition of syntax and
semantics of a language, as only on this basis can appropriate language features, a
precise semantics, and adequate pragmatic guidelines on how to use the language
be given.

Therefore, the aim of this paper is twofold. First, we want to contribute to 
the understanding of the conceptual model underlying today’s software systems. 
This shall be done by presenting a typical example of a mobile, distributed appli- 
cation and by identifying three relevant dimensions. Second, we shall provide a 
formalization of the conceptual model using concepts and results from the theory 
of graph transformation systems. The basic idea is to model the system’s states 
as graphs, and to specify its dynamic change by transformation rules. 

In the theory of semantics, various formalisms have been developed for the 
operational specification of concurrent, distributed and mobile systems. Most 
of them are based on a notion of state much simpler than graphs. Among the 
numerous term-based approaches we only mention process calculi like [?]
building on labeled transition systems and the Structured Operational Semantics
(SOS) paradigm and approaches based on rewriting systems like [?]. In
Petri net-based approaches [?] the states of a system are modeled as sets or
multi-sets. These formalisms have been quite successful in modeling relevant
aspects of software systems, and they have developed a large body of theory 
which can aid the developer in structuring, understanding, and verifying her 
specifications. 

Still, we believe that graphs and graph transformation are indispensable 
means for providing an operational, conceptual and formal framework for to- 
day’s software development. First, the states of object-oriented, distributed, and 
mobile systems are most naturally represented as graphs modeling, e.g., object 
structures, software architectures, network topologies, etc. Then, for describing 
changes to these states like the insertion or deletion of objects or links, archi- 
tectural or network reconfiguration, graphs have to be manipulated. Second, in 
diagrammatic modeling languages, the abstract syntax of an individual model 
is usually represented as a graph instead of a term or tree as in (textual) pro- 
gramming languages. Thus, for translating one diagram language into another, 
for defining the semantics of diagrams by reducing them to a normal form or by 
animating them, graph-based techniques are needed where, in the case of textual 
languages, term rewrite systems or SOS specifications may be used. 

In this paper, we will be interested in the first issue of interpreting graphs
as states and graph transformation systems as “programs”. The second idea of
graphs as representing diagrams and graph transformation systems defining their
operational semantics is dealt with, for example, in [?]. In the next section, we
will discuss a simple application scenario of a distributed and mobile system. This
will serve for identifying the relevant dimensions of a conceptual framework for
expressing structural and behavioral aspects as well as the evolution of today’s
software systems. Based on these investigations, in Sect. 3 we will use graph
transformation systems to formalize our conceptual framework making explicit
the consistency and orthogonality of its different dimensions. We close with a
summary and a discussion of further aspects of software development which are
not covered in the paper.



2 A Conceptual Model for Distributed and Mobile 
Application 

In this section, we discuss and analyze a concrete application scenario of a
distributed system with mobile hardware components in order to identify the
relevant concepts and dimensions in modeling such systems. This conceptual model
shall provide the basis for the formal approach developed in Sect. 3.

Our application scenario is (a simplified version of) a distributed system from
the financial domain which uses Java cards (i.e., smartcards with a microprocessor
supporting the JavaCard runtime environment [?]) to handle financial
transactions (like the paying of bills) which require the coordination of services
from different providers (e.g., a shop and a bank). A customer possessing such a
smartcard may for instance, after purchasing some goods in a shop, use this card
to download a corresponding bill from a cash box. Later on, she may pay her bills
at a banking terminal. This terminal is one out of several hundreds, connected
via an intranet connection to the mainframe of the bank. Thus, the scenario
involves four different types of hardware components: cash boxes, smartcards,
banking terminals, and mainframes (cf. Fig. 2). Each hardware component has
a processor and memory, and can host and execute software components.




Fig. 2. Sample application scenario. 




Graph Transformation as a Conceptual and Formal Framework 131 



2.1 Hierarchical Structures 

Fig. 3 shows a snapshot of the hardware architecture of the scenario on a more
abstract level. The diagram depicts five hardware components where the SmartCard
component is linked to one of two BankingTerminal components which are
both connected to a MainFrame. The diagram is shown in a UML-like notation
[?], where instances i of class C are represented as i:C (or simply :C if the identity
is not of interest). The boxes are an iconic notation for hardware components
(called nodes) as it is known from UML deployment diagrams.

This hardware architecture forms the base layer of any running software 
system. Each node is equipped with some software components, which interact 
with each other locally (at one node) or remotely (between different nodes). 
These software components form the second layer of a software system. 




Fig. 3. Hardware architecture of the application scenario (snapshot). 



The allocation of software components to nodes can be described by UML
deployment diagrams. A sample diagram is shown in Fig. 4 where the SmartCard
hosts a BillCard component responsible for accepting a bill issued by the
Billing component of a CashBox as well as for storing this bill and transferring
it to a BankingTerminal for payment. Notice that software components may be
temporarily linked even if they reside on different nodes. A node may also host
several different software components as, for instance, the SmartCard which
carries a separate AuthCard component responsible for the authentication of the
card owner.

From a technical point of view, these diagrams are graphs with labeled (and 
attributed) vertices and edges. The deployment diagram in Fig. 4 can even be






G. Engels and R. Heckel 




Fig. 4. Deployment diagram. 



viewed as a kind of hierarchical graph where vertices representing hardware nodes 
contain vertices representing software components. Due to the links between 
software components running on different nodes, however, this hierarchy is not 
strict, as sublayers of different subgraphs may be directly interconnected. 

On top of the hardware architecture and component layers, a third layer can 
be identified. These are the problem-domain objects (the objects doing the actual
work) within software components. Fig. 5 provides an extended deployment
diagram which, in addition, depicts problem-domain objects like the :Bill and
:Customer located within the Billing component.



[Figure: a :CashBox node (processor) hosting the software component :Billing, which
contains the problem-domain objects :Bill and :Customer; further labels: Pays,
amount = a.]

Fig. 5. Deployment diagram extended by problem-domain objects.

That is, our conceptual framework for system modeling shall be based on
hierarchical graphs where at least three layers have to be distinguished (cf. Fig.
11). Moreover, we often encounter hierarchical structures even within any of
the three layers of objects, software components, and hardware architecture.
This multi-layered hierarchical graph structure forms the first dimension of our
conceptual framework.

2.2 Typed Structures 

All diagrams discussed so far have shown individual snapshots of the application 
scenario. In the world of programming, this corresponds to a state in the execu- 
tion of a program which is specified in the program text by means of variables 
and types. Analogously, in modeling we also distinguish between an instance and 
a type level and demand that the instance level is consistent with the type level. 

In the layer of problem-domain objects, the type level is provided by a class
diagram as depicted in Fig. 6. It defines three different classes and their
associations where the multiplicity is restricted by appropriate cardinality
constraints. Classes are described by their names and by attribute definitions,
whereas instances of a class will have concrete values for these attributes.




Fig. 6. Problem-domain class diagram. 



Analogously, the allowed software and hardware architectures on the instance
level may be determined by appropriate type definitions. Type definitions for
software architectures, also called architectural styles [?], determine a set of
admissible concrete architectures. Fig. 7 gives an example for the definition of a
hardware architectural style using a hardware diagram on the type level. The
notation is similar to that of the class diagram in Fig. 6 but for the use of
the node icon. The xor-constraint ensures that a SmartCard instance is either
connected to a CashBox or to a BankingTerminal.

Fig. 7. Hardware architecture (type level).

Hence, beside the hierarchical structure of the model we have identified a
second, orthogonal dimension of typing. The states of our conceptual model are
therefore multi-layered, non-strict hierarchical graphs typed over correspondingly
structured type graphs.

2.3 Instance Level Dynamics 

So far, we have only dealt with the structural aspects of our model. In order to 
provide semantics for distributed and mobile applications with dynamic recon- 
figuration we have to specify operations for transforming this structure. 

Dynamic change on the instance level is specified by transformation rules as
shown in Fig. 8. A rule consists of two object diagrams, the left- and the
right-hand side, where the former specifies the situation before the operation and
the latter the situation afterwards. The rule in Fig. 8 describes the operation of a
customer paying a bill from his account by transferring the required amount to
the specified target account. The transformation of object structures has to be
consistent with the typing in the sense that the constraints on the instance level
imposed by the type-level diagrams are preserved.

The object transformation shown in Fig. 8 can be seen as an abstract requirement
specification of the operation payBill disregarding the constraints imposed
by the architectural layer. According to our scenario, Account and Bill objects
originally reside on different nodes. Therefore, the bill has to be transferred from
the cash box to the banking system by means of a SmartCard. The operations
required for downloading the bill onto the card are described in Fig. 9. When
the card is inserted into the cash box’s card reader, a hardware connection is
established. This triggers, after completion of an appropriate authentication
protocol, the connection of the Billing with the BillCard component. Then, the bill
is stored in the BillCard component and the customer’s identity is recorded by
the Billing component of the cash box.

As indicated by the identities of the Customer and the Bill objects, these
objects are not simply copied from one component to the other but they are
actually shared after the operation. This is physically impossible, but
object-oriented middleware like CORBA [?] supports the sharing of objects (as well
as references between objects on different machines) through the concept of
proxies, i.e., local place-holders providing access to remote objects. As a conceptual
framework for the application developer, our model should provide features (at
least) on the same level of abstraction as offered by state-of-the-art programming
technology. Of course, in order to allow for sharing on the instance level, also on
the type level it must be possible to share classes between different component
types.

[Figure: transformation rule payBill(b).]

Fig. 8. Object transformation rule.

Notice that we have described dynamic changes within all three layers of our 
hierarchy and all but the first transformation are concerned with two layers at 
the same time. Thus, not only the states of our systems are hierarchical, but 
also the operations have to take care of the hierarchical structure, as they are 
potentially not restricted to a single layer. 



2.4 Type Level Evolution 

The last point to be discussed in this section is the difference between the dy- 
namic change on the instance level as described above and the evolution of the 
system by changes on the type level. As systems have to be adapted to new 
requirements, not only their configuration may change, but also it may be nec- 
essary to introduce new types of hardware or to deploy new software components 
containing objects of classes which have not been known before. 

In our application scenario, a new component CashCard is downloaded on
the card in order to provide the additional service of using the card directly
for paying bills at a cash box. A corresponding Cashing component is needed
at cash boxes while banking terminals have to provide a CashCardService to
transfer virtual money to the card. On the hardware level, the evolution consists
in establishing a new kind of hardware connection in order to transfer bills from
the cash box to the banking system via the internet. The new types are shown
in the diagram of Fig. 10.

Fig. 9. Transformation of the instance level hierarchy.

Fig. 10. Evolution of hardware and software architecture.



The overall framework is summarized in Fig. 11. It distinguishes between an
instance level and a type level. Within each level, a hierarchical structure of three 
layers can be identified. For instance on the instance level, we have the problem- 
domain objects on the first layer, which are distributed over a component-based 
software architecture on the second layer, where software components are spread 
over a network of hardware components on the third layer. Orthogonally to 
these two dimensions of typing and hierarchical structures, we consider, as a 
third dimension, the dynamic change on the instance level and the evolutionary 
change on the type level, both in all of the three layers. The formalization and 
consistent integration of these three dimensions is the subject of the next section. 

3 Formalizing the Conceptual Model 
with Graph Transformation Systems 

In Sect. 2 we have analyzed the structures relevant for the modeling of
distributed and mobile applications. In retrospect, in a single model we can
identify the three dimensions of typing, dynamic and evolutionary change, and of the
hierarchy of objects, software and hardware components. In this section, these
three dimensions as well as their interdependencies shall be formalized.

Fig. 11. Instance and type level as well as the three-layered hierarchy of system models.

We build upon concepts from the theory of graph transformation systems
which focus on the specification of systems where states are represented as graphs
and changes are specified by transformation rules (thus dynamic change is
already present). Surveys on the state-of-the-art of theory and application of graph
transformation can be found in the three handbook volumes [?]. A good
introductory text is [?].

Among the various formalizations of graph transformation, the “algebraic, 
Double PushOut (DPO) approach” | 18 IU)j is one of the most successful, mainly 
because of its flexibility. In fact, since the basic definitions of rule and trans- 
formation are based on diagrams and constructions in a category, they can be 
used in a uniform way for a wide range of structures. Therefore, many results 
can be proved once and for all using categorical techniques ca This flexibility 
shall be exploited in the following in order to augment the original formalism 
IT^ with several levels of typing (in the spirit of the typed DPO approach 0), 
thus representing the hierarchy within a model and its evolution on the type 
level. 



3.1 Object Dynamics as Typed Graph Transformation 

In this section, we provide the formal semantics of the dynamic change at the
instance level discussed conceptually in Sect. 2.3. This includes the concepts
of typing as described in Sect. 2.2 as well as the transformation of
instances in a type-consistent way.

Graphs and typed graphs. The relation between diagrams on the type level and 
diagrams on the instance level is formally captured by the concept of type and 
instance graphs |2|. 

By graphs we mean directed unlabeled graphs $G = (G_V, G_E, src^G, tar^G)$
with set of vertices $G_V$, set of edges $G_E$, and functions $src^G : G_E \to G_V$ and
$tar^G : G_E \to G_V$ associating with each edge its source and target vertex. A
graph homomorphism $f : G \to H$ is a pair of functions $(f_V : G_V \to H_V,\ f_E : G_E \to H_E)$
preserving source and target, that is, $src^H \circ f_E = f_V \circ src^G$ and
$tar^H \circ f_E = f_V \circ tar^G$. The category of graphs and graph morphisms with composition
and identities defined componentwise on the underlying functions is denoted by
$\mathbf{Graph}$.
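
The two conditions on a graph homomorphism can be checked mechanically for finite graphs. The following is a minimal sketch, assuming graphs are stored as Python sets and dictionaries; the class `Graph`, the function `is_homomorphism`, and the example graphs are illustrative names that do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    vertices: set
    edges: set
    src: dict  # edge -> source vertex
    tar: dict  # edge -> target vertex


def is_homomorphism(f_v, f_e, g, h):
    """Check that (f_v, f_e) is a graph homomorphism from g to h."""
    # Both components must be total functions into the target graph.
    if set(f_v) != g.vertices or set(f_e) != g.edges:
        return False
    if not set(f_v.values()) <= h.vertices or not set(f_e.values()) <= h.edges:
        return False
    # src^H o f_E = f_V o src^G  and  tar^H o f_E = f_V o tar^G
    return all(
        h.src[f_e[e]] == f_v[g.src[e]] and h.tar[f_e[e]] == f_v[g.tar[e]]
        for e in g.edges
    )


# Hypothetical example: a one-edge path and a single loop.
path = Graph({"a", "b"}, {"e"}, {"e": "a"}, {"e": "b"})
loop = Graph({"v"}, {"l"}, {"l": "v"}, {"l": "v"})
print(is_homomorphism({"a": "v", "b": "v"}, {"e": "l"}, path, loop))
print(is_homomorphism({"v": "a"}, {"l": "e"}, loop, path))
```

Collapsing the path onto the loop preserves sources and targets, whereas mapping the loop onto the path edge does not, because the target of the path edge differs from the image of the loop's target.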

Definition 1 (typed graphs). Given a graph $TG$, called type graph, a $TG$-typed
(instance) graph consists of a graph $G$ together with a typing homomorphism
$g : G \to TG$ associating with each vertex and edge $x$ of $G$ its type $g(x) = t$
in $TG$. In this case, we also write $x : t \in G$. A $TG$-typed graph morphism between
two $TG$-typed instance graphs $(G, g)$ and $(H, h)$ is a graph morphism $f : G \to H$
which preserves types, that is, $h \circ f = g$. The category of all instance graphs
typed over $TG$ is denoted by $\mathbf{Graph}_{TG}$.



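[Diagram: typing morphisms $g : G \to TG$ and $h : H \to TG$ with a morphism $f : G \to H$ such that $h \circ f = g$.]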



Categorically speaking, $\mathbf{Graph}_{TG}$ is the comma category $(\mathbf{Graph} \downarrow TG)$ of
graphs over $TG$ (see, e.g., m). This observation allows us to inherit categorical
constructions from $\mathbf{Graph}$ to $\mathbf{Graph}_{TG}$. In particular, since the category of
graphs has all limits and colimits, the comma category $\mathbf{Graph}_{TG}$ has all limits
and colimits as well, and their constructions coincide in $\mathbf{Graph}$ and $\mathbf{Graph}_{TG}$
up to the additional typing information.
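
The type-preservation condition $h \circ f = g$ can be illustrated on finite data. This is only a sketch under simplifying assumptions: the typings and the morphism are dictionaries over all items (vertices and edges alike), the homomorphism property itself is not re-checked, and the edge type `pays` and all item names are hypothetical.

```python
def compose(second, first):
    """Return the composite second o first of dictionary-encoded functions."""
    return {x: second.get(y) for x, y in first.items()}


def preserves_types(f, g, h):
    """True iff h o f = g, for f : G -> H and typings g, h into TG."""
    return set(f) == set(g) and compose(h, f) == g


# Hypothetical instance graphs typed over a type graph with vertices
# Customer and Bill and one edge type "pays" (names chosen for illustration).
g = {"c": "Customer", "b": "Bill", "e": "pays"}
h = {"c1": "Customer", "b1": "Bill", "b2": "Bill", "e1": "pays"}
good = {"c": "c1", "b": "b1", "e": "e1"}
bad = {"c": "b2", "b": "b1", "e": "e1"}  # sends a Customer to a Bill
print(preserves_types(good, g, h), preserves_types(bad, g, h))
```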

Example 1 (type and instance graphs). Fig. 12 shows the class diagram of Fig. 0 
as a type graph, as well as a corresponding instance graph. We use the UML- 
like notation vertex:type for specifying instance graphs and their typing. The 
name of the vertex may be omitted in which case the diagram represents an 
isomorphism class of graphs. We do not formally deal with attributes in this 
paper. For approaches to the transformation of attributed graphs the reader is 
referred to EHEHEI. 

Typed graph transformations. The DPO approach to graph transformation has 
originally been developed for vertex- and edge-labeled graphs UHl- Here, we 
present immediately the typed version jO]. 

According to the DPO approach, graph transformation rules (also called
graph productions) are specified by pairs of injective graph morphisms
$(L \xleftarrow{l} K \xrightarrow{r} R)$, called rule spans. The left-hand side $L$ contains the items
that must be present for an application of the rule, the right-hand side $R$ those
that are present afterwards, and the context graph $K$ specifies the “gluing items”,
i.e., the objects which are read during application, but are not consumed. The
transformation of graphs is defined by a pair of pushout diagrams, a so-called
double pushout.

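Definition 2 (DPO graph transformation). Given a type graph $TG$, a $TG$-typed
graph transformation rule is a pair $p : s$ consisting of a rule name $p$ and a
rule span $s = (L \xleftarrow{l} K \xrightarrow{r} R)$.

A double-pushout (DPO) diagram $o$ is a diagram like below where (1) and
(2) are pushouts. Given a rule $p : s$ like above, the corresponding (direct DPO)
transformation from $G$ to $H$ is denoted by $G \stackrel{p/o}{\Longrightarrow} H$.

$$
\begin{array}{ccccc}
L & \xleftarrow{\;l\;} & K & \xrightarrow{\;r\;} & R \\
\Big\downarrow{\scriptstyle o_L} & (1) & \Big\downarrow{\scriptstyle o_K} & (2) & \Big\downarrow{\scriptstyle o_R} \\
G & \longleftarrow & D & \longrightarrow & H
\end{array}
$$

Fig. 12. Type and instance graph.
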
The DPO diagram $o$ is a categorical way of representing the occurrence of a
rule in a bigger context. Assuming a simpler representation of rules, the same
situation can also be expressed in a more set-theoretic way. In fact, a graph
transformation rule can be represented (up to renaming) as a pair of graphs
with rule name $p : L \to R$ such that the union $L \cup R$ is defined (i.e., graphs $L$
and $R$ live in the same name space). The span representation can be recovered
as $p : L \leftarrow L \cap R \to R$. That is, given that $L \cup R$ is defined, the interface graph
$K$, which is needed in the categorical setting for specifying the sharing between
$L$ and $R$, can be reconstructed and is therefore omitted.

Then, a direct transformation from $G$ to $H$ (with $G \cup H$ defined) using rule
$p : L \to R$ is given by a graph morphism $o : L \cup R \to G \cup H$, called occurrence,
such that

- $o(L) \subseteq G$ and $o(R) \subseteq H$, i.e., the left-hand side of the rule is embedded into
  the pre-state and the right-hand side into the post-state, and
- $o(L \setminus R) = G \setminus H$ and $o(R \setminus L) = H \setminus G$, i.e., exactly that part of $G$ is deleted
  which is matched by elements of $L$ not belonging to $R$ and, symmetrically,
  that part of $H$ has been added which is matched by elements new in $R$.

Like in the categorical definition, the resulting graph $H$ is only determined up
to isomorphism by the rule $p : L \to R$ and the occurrence $o_L : L \to G$.
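
The set-theoretic reading of a direct transformation can be sketched in a few lines. The sketch below makes strong simplifying assumptions that are not part of the paper: the occurrence is the identity (the rule has already been renamed to match the host graph), a graph is a plain set of items, a vertex is a string, and an edge is a `(name, source, target)` triple. The function `apply_rule` and the sample rule are hypothetical.

```python
def apply_rule(G, L, R):
    """Apply the rule p : L -> R to G, taking the occurrence to be the identity.

    A graph is a set of items: a vertex is a string and an edge is a
    (name, source, target) triple. Returns H, or None if the rule does not apply.
    """
    if not L <= G:
        return None  # o(L) must be contained in G
    deleted, added = L - R, R - L
    if added & G:
        return None  # items created by the rule must be new
    D = G - deleted
    vertices = {x for x in D if isinstance(x, str)}
    for x in D:
        if isinstance(x, tuple) and not {x[1], x[2]} <= vertices:
            return None  # deleting would leave a dangling edge
    return D | added


# Hypothetical rule: a customer pays a bill and obtains a receipt.
L = {"c", "b", ("owes", "c", "b")}
R = {"c", "r", ("has", "c", "r")}
G = {"c", "b", "s", ("owes", "c", "b"), ("at", "c", "s")}
H = apply_rule(G, L, R)
print(G - H == L - R, H - G == R - L)
```

With the identity as occurrence, the two conditions of the definition reduce to `G - H == L - R` and `H - G == R - L`, which the final line checks for the sample rule.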

Example 2 (typed graph transformation). Fig. 13 shows the span representation
of the object transformation rule of Fig. 0 as well as its application to the instance
graph in Fig. 12. Operationally speaking, the application of the rule proceeds
as follows. Given the occurrence of the left-hand side in the given graph $G$,
determined by the mapping of the b:Bill vertex, the application consists of two
steps: First, the objects of $G$ matched by $L \setminus l(K)$ are removed, which leads to the
graph $D$ without the bill. Then, the objects matched by $R \setminus r(K)$ are added to
$D$, leading to the derived graph $H$.

Gluing the graphs $L$ and $D$ over their common part $K$ yields again the given
graph $G$, i.e., the left-hand square (1) forms a so-called pushout complement.
Only in this case the application is permitted. Similarly, the derived graph $H$ is
the gluing of $D$ and $R$ over $K$, which forms the right-hand side pushout square
(2).

The same rewrite mechanism is used for the update of attributes: In order
to change an attribute value, the attribute is deleted and re-generated with the new
value.

The formalization of deletion as “inverse gluing” implies that the application
of a rule can be described as an embedding of $K = L \cap R$ into a context by
means of the occurrence morphism $o_K : K \to D$ (cf. Fig. 13). This implies that
only vertices in the image of $L \cap R$ can be merged or connected to edges in the
context. These observations are reflected, respectively, in the identification and
the dangling condition of the DPO approach which characterize, given a rule
$p : (L \xleftarrow{l} K \xrightarrow{r} R)$ and an occurrence $o_L : L \to G$ of the left-hand side,
the existence of the pushout complement (1), and hence of a direct derivation^1
$G \stackrel{p/o}{\Longrightarrow} H$. The identification condition states that objects from the left-hand side
may only be identified by the match if they also belong to the interface (and are
thus preserved). The dangling condition ensures that the structure $D$ obtained
by removing from $G$ all objects that are to be deleted is indeed a graph, that is,
no edges are left “dangling” without source or target node.
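
Both gluing conditions are simple to test for a concrete match. The sketch below assumes a match given as a dictionary from items of $L$ to items of $G$, the interface $K$ given as a subset of the items of $L$, and vertex and edge names that are pairwise distinct; the function names and the sample data are hypothetical.

```python
def identification_ok(match, interface):
    """Items of L may share an image only if all of them lie in the interface K."""
    first_with_image = {}
    for item, image in match.items():
        other = first_with_image.setdefault(image, item)
        if other != item and not (item in interface and other in interface):
            return False
    return True


def dangling_ok(match, interface, host_edges):
    """No edge that stays in the host graph may touch a deleted vertex.

    host_edges maps each edge of G to its (source, target) pair.
    """
    deleted = {image for item, image in match.items() if item not in interface}
    return all(
        edge in deleted or (s not in deleted and t not in deleted)
        for edge, (s, t) in host_edges.items()
    )


# Hypothetical match: L has a customer c (preserved), a bill b and an edge e
# (both deleted); vertex and edge names are assumed to be pairwise distinct.
match = {"c": "c1", "b": "b1", "e": "e1"}
interface = {"c"}
host_edges = {"e1": ("c1", "b1")}
print(identification_ok(match, interface), dangling_ok(match, interface, host_edges))
```

If the host graph had a further edge attached to `b1` that the rule does not mention, `dangling_ok` would reject the match, which corresponds to the pushout complement (1) not existing.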

In this way, the DPO construction ensures the consistency of the transfor- 
mation with the graph structure. Type consistency is ensured by the fact that 
the transformation rules themselves are well-typed. Together with the typing of 
the given graph G this induces the typing of the derived graph H . 



^1 The pushout (2) always exists since the category $\mathbf{Graph}_{TG}$ is cocomplete due to the
cocompleteness of $\mathbf{Graph}$.

Fig. 13. DPO graph transformation step.



3.2 Hierarchical Structures as Aggregated Graphs 

In Sect. 2.3 we have pointed out the requirements for modeling the dynamic 
change of the instance level hierarchy consisting of the object, software and 
hardware component layers where, in general, each of these layers may be again 
hierarchical. It has been observed that these hierarchies are not strict (neither on 
the type nor on the instance level) in the sense that vertical sharing is allowed, 
i.e., the refinement relation forms a directed acyclic graph (DAG) rather than a 
tree. In this section, we outline an approach to formalize the transformation of 
such structures. 

Hierarchical graphs and their transformation have received much attention 
in the graph transformation literature. The first formal account we are aware of 
is iS|. Close to the conceptual model outlined in Sect. 2.1 is the approach of 
distributed graph transformation m. Distributed graphs are hierarchical graphs 
with two layers only. Formally, they are diagrams in the category of graphs. That 
means, a graph G representing the network topology is attributed with graphs 
and graph morphisms modeling, respectively, the object structures contained in 
the local components and relationships between them. The transformation of 
these graphs is based on the double-pushout approach. As distributed graphs 
also allow sharing of objects between different components, they would be a 
natural candidate for formalizing our hierarchical models. However, they are 
restricted to hierarchies of depth two (although the construction of diagrams in 
the category of graphs could be iterated). 

A new approach towards hierarchical graph transformation based on the 
DPO transformation of hypergraphs (see, e.g., [E|) which allows hierarchical 
graphs of arbitrary finite depth is proposed in H2| Conceptually, hyperedges in 
this approach represent components which contain other hypergraphs, and the 
attachment vertices of a hyperedge represent ports through which the compo- 
nents are connected. The drawback of this approach from our point of view is 
its limitation to strict, tree-like hierarchies without vertical sharing. 

More general concepts of hierarchical graphs allowing both vertical sharing 
and multiple-layered structures have been proposed, for example, in |:^llt)j but 
neither of them formally accounts for the transformation of these graphs. Sum- 
marizing, none of the approaches we are aware of satisfies all the requirements 
of Sect. 2, although many approaches contribute to some aspect of the overall 
problem. 

Instead of extending the underlying formalism as it is done in most approaches,
we propose to model hierarchical graphs with vertical sharing by aggregated
graphs, i.e., graphs with distinguished aggregation edges representing
the refinement of vertices. This approach has the advantage of preserving the 
theoretical results and semantic properties of the basic typed graph transforma- 
tion approach. In particular, the concurrent semantics in terms of concurrent 
traces, graph processes, and event structures (see 0 for a recent survey) is rele- 
vant to our aim of developing a formal model for distributed and mobile systems. 

To distinguish between aggregation edges and ordinary edges in both
type and instance graphs, we introduce an additional level of typing, the meta
type level. In the meta type graph $TG_0$ in Fig. 14 on the left, two kinds of
edges are defined: ordinary edges and aggregation edges with a diamond head
designating the super-vertex. Since there is only one node representing all kinds
of vertices, $TG_0$ is the most general type graph serving our purpose. It could be
further refined by separating the three layers of objects, software and hardware
components as shown in the same figure on the right. In fact, since the graph
$TG_1$ on the right is itself typed over $TG_0$, it could be used to provide another
level of typing (that we omit for the sake of simplicity). An analogous approach
can be found in the standardization of the UML by the OMG PH based on the
Meta Object Facility [22], a meta modeling framework which provides four levels
of typing.

Fig. 14. Meta type graphs.



Thus, graphs typed over $TG_0$ represent graphs with aggregation edges. Since
the genuine type level is still missing, typed graphs with aggregation edges are
defined by replacing in Definition 1 the category of graphs and graph morphisms
with the category $\mathbf{Graph}_{TG_0}$ of $TG_0$-typed graphs. Given a $TG_0$-typed graph
$(TG, tg_0 : TG \to TG_0)$ representing a type graph with aggregation edges, we
build the comma category $\mathbf{Graph}_{TG_0} \downarrow (TG, tg_0)$ of $TG_0$-typed graphs over
$(TG, tg_0)$. Objects of this category are graphs with two levels of typing compatible
with the typing of $TG$ as represented by the commutative diagram below
on the left.




A morphism of $(TG, tg_0)$-typed graphs $(G, g)$ and $(H, h)$ is a graph morphism $f$
such that the diagram above on the right commutes. In particular, notice that
both the meta type graph $TG_0$ and the type graph $(TG, tg_0)$ remain fixed, i.e.,
the only variation allowed is on the level of instances.



Example 3 (typed graph with aggregation). Fig. 15 shows a type graph with aggregation
edges which is itself an instance of the meta type graph $TG_1$ in Fig. 14.
Notice the vertical sharing of the classes Customer and Bill between the Billing 
and BillCard components as well as the link between the two components con- 
tained in different nodes. 



The transformation of $(TG, tg_0)$-typed graphs is defined by replacing in
Definition 2 the category $\mathbf{Graph}_{TG}$ for a given type graph $TG$ with the category
$\mathbf{Graph}_{TG_0} \downarrow (TG, tg_0)$ for a given $TG_0$-typed type graph $(TG, tg_0)$. Due to
the notion of morphism, the transformation is restricted to the instance level
while the type level remains static. Notice again that the use of a comma category
not only allows us to inherit the categorical structure relevant for the
definition of transformation, but also to reestablish most theoretical results of
double-pushout graph transformation.

Fig. 15. A type graph with aggregation edges.

Example 4 (graph transformation rule with aggregation). A graph transformation
rule conforming to the type graph of Fig. 15 is shown in Fig. 16. It represents
the formal counterpart of the lower rule transferBill in Fig. El (the context graph
is omitted). Here, the sharing of Bill and Customer between the Billing and the
BillCard components, which has been noted already in the type graph, occurs on
the level of instance graphs in the right-hand side of the rule.

Although the concept of meta typing allows us to represent graphs with aggregation
edges, additional constraints are required in order to exclude meaningless
structures. In particular, we have to require the acyclicity of aggregation edges
in the instance level graphs. Thus, given a $TG_0$-typed type graph $(TG, tg_0)$, an
aggregated graph is a $(TG, tg_0)$-typed instance graph such that the aggregation
edges form a directed acyclic graph. Further, an aggregated graph transformation
rule is a rule in $\mathbf{Graph}_{TG_0} \downarrow (TG, tg_0)$ which does not introduce a new
aggregation edge between two already existing vertices. This is enough to show
that the application of a rule on an aggregated graph yields again an aggregated
graph (i.e., it does not create cycles of aggregation edges), thus ensuring the
consistency between dynamic change and hierarchical structure.

Fig. 16. Graph transformation rule with aggregation.
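
The two constraints on aggregated graphs and rules are easy to check on finite data. The following sketch assumes aggregation edges are given as `(part, whole)` pairs; the function names and the sample vertices are illustrative and not taken from the paper. Acyclicity is tested with Kahn's algorithm, which is one standard choice among several.

```python
def is_acyclic(vertices, aggregation_edges):
    """Kahn's algorithm; aggregation_edges is a list of (part, whole) pairs."""
    indegree = {v: 0 for v in vertices}
    for _, whole in aggregation_edges:
        indegree[whole] += 1
    ready = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for part, whole in aggregation_edges:
            if part == v:
                indegree[whole] -= 1
                if indegree[whole] == 0:
                    ready.append(whole)
    # Every vertex can be removed exactly when there is no cycle.
    return removed == len(vertices)


def rule_keeps_aggregation(l_vertices, l_agg, r_agg):
    """No new aggregation edge may join two vertices that already exist in L."""
    return all(
        part not in l_vertices or whole not in l_vertices
        for part, whole in r_agg
        if (part, whole) not in l_agg
    )


# Hypothetical instance: one bill shared by two components (vertical sharing).
vertices = {"bill", "billing", "billCard"}
shared = [("bill", "billing"), ("bill", "billCard")]
print(is_acyclic(vertices, shared))
print(is_acyclic(vertices, shared + [("billing", "bill")]))
```

A rule that attaches a freshly created vertex to an existing component passes `rule_keeps_aggregation`, while a rule that adds an aggregation edge between two vertices already present in its left-hand side is rejected.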

Of course, other constraints may be added, for example, about the connec- 
tions allowed between graphs within different nodes or between different layers 
of the hierarchy (see, e.g., E3). Such constraints, however, are highly applica- 
tion dependent and they should not be part of the basic formalism. Instead, 
a constraint language is required for specifying such constraints and verifying 
the consistency of rules. The most prominent logical approach for specifying 
graph properties is based on second-order monadic logic. In contrast to the 
first-order case, this allows the specification of path properties like the acyclicity 
constraints mentioned above. A less powerful formalism for expressing integrity 
constraints which is based on a graphical notation has been introduced in . 



3.3 Model Evolution through Type Level Transformation 



In the previous section, we have used meta typing in order to distinguish ordinary 
and aggregation edges representing refinement in hierarchical graphs. We have 
pointed out that, due to the notion of morphism between graphs with two levels 
of typing, the transformation is restricted to the instance level while the type 
level remains static. In this section, it is our aim to extend the transformation 
to the type level. To this aim, we provide another construction of graphs with 
two-level typing which yields the same notion of graphs but a more flexible kind 
of morphism. 

The arrow category $C^{\to}$ of a given category $C$ has arrows $a : A \to A'$ of $C$
as objects. A morphism between $a : A \to A'$ and $b : B \to B'$ is a pair of arrows
$(f, f')$ in $C$ such that the square below on the right commutes.

$$
\begin{array}{c} A \\ \Big\downarrow{\scriptstyle a} \\ A' \end{array}
\qquad\qquad
\begin{array}{ccc}
A & \xrightarrow{\;f\;} & B \\
\Big\downarrow{\scriptstyle a} & & \Big\downarrow{\scriptstyle b} \\
A' & \xrightarrow{\;f'\;} & B'
\end{array}
$$

The arrow category $\mathbf{Graph}^{\to}$ over the category of graphs has graph morphisms
as objects. We may think of them as typing morphisms from instance graphs
to type graphs. With this intuition, morphisms of $\mathbf{Graph}^{\to}$ consist of graph
morphisms both on the instance and the type level, thus allowing the variation
of types as required for evolutionary change.
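
The commuting-square condition for a morphism of the arrow category can be checked directly when all arrows are finite functions. The sketch below flattens graphs to plain sets of items, so the arrows are dictionaries; this is a simplification, and the renaming of the type `Bill` to `Invoice` is a made-up evolution step used only for illustration.

```python
def is_arrow_morphism(f, f_prime, a, b):
    """(f, f') is a morphism from a : A -> A' to b : B -> B' iff f' o a = b o f."""
    return all(f_prime[a[x]] == b[f[x]] for x in a)


# Hypothetical evolution step: the type Bill is renamed to Invoice while the
# instances are kept (all names here are illustrative assumptions).
a = {"b1": "Bill", "c1": "Customer"}  # typing before the step
b = {"b1": "Invoice", "c1": "Customer"}  # typing after the step
f = {"b1": "b1", "c1": "c1"}  # instance-level component
f_prime = {"Bill": "Invoice", "Customer": "Customer"}  # type-level component
print(is_arrow_morphism(f, f_prime, a, b))
```

In contrast to the comma category used before, the type-level component `f_prime` is allowed to differ from the identity, which is what makes changes of the type graph expressible.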

In order to preserve the hierarchical structure established through meta typing
and to provide some control on the transformation of types, we build the
arrow category over the category of $TG_0$-typed graphs. Thus, objects of this
category are arrows in $\mathbf{Graph}_{TG_0}$ like below on the left, and arrows are pairs of
arrows $(f_I, f_T)$ in $\mathbf{Graph}_{TG_0}$ as shown on the right:




Now we can define our ultimate notion of transformation, catering for hierarchical
structures as well as for type-level evolution, by replacing the category of
$TG$-typed graphs in Definition 2 with the category $\mathbf{Graph}_{TG_0}^{\to}$.

Transformation rules in this setting consist of two-level graphs. They modify 
consistently the type and the instance level. The consistency between the two 
levels of transformation and the meta typing realizing the hierarchical structure 
is implicit in the commutativity of the above diagrams. 

A more general approach to system evolution has been presented in [tlS) . In 
addition to the evolution of type graphs, in also the transformation rules 
specifying dynamic change may be modified by application of higher-level evo- 
lution rules. Thus, unlike in our approach, dynamic change and evolution form 
orthogonal dimensions. 



4 Conclusion 



In this paper, we have proposed a conceptual framework for mobile and dis- 
tributed applications. Three orthogonal dimensions of such systems have been 
identified: A multi-layered, non-strict hierarchical graph structure, the typing of 
instance graphs over type graphs, and the dynamic and evolutionary change 
occurring, respectively, on the instance and the type level. This conceptual 
framework is formalized using concepts of graph transformation, in particu- 
lar, typed graph transformation systems according to the categorical double- 
pushout (DPO) approach 1 18191 and hierarchical graphs. The use of categorical 
constructions like comma categories allows us to express the orthogonality and 
consistency of the three dimensions while inheriting many theoretical results. 

Our framework does not yet cover all important aspects of software develop- 
ment. Some of these aspects have been investigated in the graph transformation 
literature for simpler kinds of systems (e.g., without typing or hierarchical struc- 
ture). In particular, the software development process with its different phases 
of analysis, design, and implementation and the refinement steps between these 
phases are subject of ongoing research (see, e.g., [29l25II7l5,'fj T Horizontal struc- 
turing techniques for models and programs like modules, views, or packages have 
been studied, for example, in j,'f ll,'fl)l,'I2l49j . In order to support the transition 
from design to implementation, high-level programming and modeling languages 
based on graph transformation are developed (see, e.g., [2,4141 )l22j l. 

Most of these approaches are neither consistent with nor strictly orthogonal to
the conceptual framework presented here. It remains future work to identify the
relation between the various concepts and to incorporate them into a consistent
overall model of software development.



Abstract. We study the complexity of proving the Pigeon Hole Principle
(PHP) in a monotone variant of the Gentzen Calculus, also known
as Geometric Logic. We show that the standard encoding of the PHP as 
a monotone sequent admits quasipolynomial-size proofs in this system. 
This result is a consequence of deriving the basic properties of certain 
quasipolynomial-size monotone formulas computing the boolean thresh- 
old functions. Since it is known that the shortest proofs of the PHP in 
systems such as Resolution or Bounded Depth Frege are exponentially 
long, it follows from our result that these systems are exponentially separated
from the monotone Gentzen Calculus. We also consider the monotone
sequent (CLIQUE) expressing the clique-coclique principle defined
by Bonet, Pitassi and Raz (1997). We show that monotone proofs for 
this sequent can be easily reduced to monotone proofs of the one-to-one 
and onto PHP, and so CLIQUE also has quasipolynomial-size monotone 
proofs. As a consequence, Cutting Planes with polynomially bounded 
coefficients is also exponentially separated from the monotone Gentzen 
Calculus. Finally, a simple simulation argument implies that these re- 
sults extend to the Intuitionistic Gentzen Calculus. Our results partially 
answer some questions left open by P. Pudlak. 



1 Introduction 

One of the main approaches to attack the NP $\neq$ co-NP question is that of
studying the length of proofs in propositional calculi. In a well-known result,
Cook and Reckhow nn proved that if all propositional proof systems are not
polynomially bounded, that is, if they have families of tautologies whose shortest
proofs are superpolynomial in the size of the formulas, then NP $\neq$ co-NP. In
spite of the simplicity of propositional proof systems such as the Hilbert Calculus
(Frege system) or the Gentzen sequent Calculus, we are admittedly far at present
from proving that these systems are not polynomially bounded. Surprisingly, one
of the main difficulties is the lack of families of tautologies that are candidates
to be hard for these systems.

Nevertheless several important results have been obtained for less powerful
but not trivial proof systems. Strong lower bounds are actually known for
systems such as Resolution |18lldl5I12ll4| , Bounded Depth Frege m and Polynomial
Calculus m. The common point among these results is the family of
formulas that is considered to give the exponential lower bounds. These formulas
encode a basic combinatorial principle known as the Pigeon Hole Principle
($PHP^m_n$), saying that there is no one-to-one mapping from a set of $m$ elements
to a set of $n$ elements, provided $m > n$. Resolution was the first proof system
for which an exponential lower bound was proved for the size of refutations of
$PHP^{n+1}_n$, a well-known result due to Haken HE!. This result was generalized
to $PHP^m_n$, for $m$ linear in $n$, by Buss and Turan [E|. The same formula,
$PHP^{n+1}_n$, was later used by Ajtai P to give a superpolynomial size lower bound
for a system that subsumes Resolution: Bounded Depth Frege. This result was
simplified and improved up to an exponential lower bound by Beame et al. p.
The complexity of $PHP^m_n$ is also well-studied in algebraic-style propositional
proof systems. Recently, Razborov HE] (see also HE)) showed that $PHP^m_n$ is also
hard for the Polynomial Calculus (notice that Riis HOI showed that a different
encoding of $PHP^{n+1}_n$ restricted to bijective maps has constant degree proofs in
the Polynomial Calculus). Actually one of the most interesting problems is to
know the exact complexity of Resolution refutations of $PHP^m_n$, when m >
giEEni. Thus, in spite of its simple combinatorial nature, $PHP^{n+1}_n$ is one of
the most commonly used principles to give proof complexity lower bounds. For
this reason, in studying the complexity of a new proof system, it is important to
consider the complexity of proving $PHP^{n+1}_n$ as a first step. After Haken’s lower
bound, it was conjectured that $PHP^{n+1}_n$ would also be hard to prove for more
powerful proof systems, such as Frege. The conjecture was refuted by Buss jOj,
who exhibited polynomial-size proofs in Frege, or equivalently, in the Gentzen
Calculus. It is also known that $PHP^{n+1}_n$ has polynomial-size proofs in Cutting
Planes and that the slightly weaker form $PHP^{2n}_n$ has quasipolynomial-size
proofs in Bounded Depth Frege Eas).

Monotone proof systems, that is, proof systems restricted to propositional
formulas over the monotone basis $\{\wedge, \vee\}$, were considered by Pudlak and Buss
and more recently, by Pudlak PJ, and Clote and Setzer uni. There are
several alternative definitions of monotone proof systems. Here we consider the
Monotone Gentzen Calculus, also called Geometric Logic. Although the only
monotone tautological formula is the true constant 1, Pudlak suggests the study
of tautological sequents of the form $A \to B$, where $A$ and $B$ are boolean formulas
built over the monotone basis $\{\wedge, \vee\}$. Several interesting combinatorial principles
can be put in this form; for example, $PHP^{n+1}_n$.

The correspondence between circuit complexity classes and proof systems inspires
new techniques to obtain both upper and lower bounds for proofs. Examples
are the lower bound of Beame et al. ^ for Bounded Depth Frege (also
known as $AC^0$-Frege), in which they used an adaptation of Hastad’s Switching
Lemma, and the polynomial upper bound of Buss for $PHP^m_n$ in Frege
(or $NC^1$-Frege) using an $NC^1$ circuit for addition. While strong lower bounds
for monotone circuits were given more than ten years ago (see 1770 1 ), nontrivial
lower bounds for monotone proof systems are not known yet. Hence,
one of the basic questions is whether $PHP^{n+1}_n$ can be used to obtain exponential
lower bounds for these systems. This question is also important since the
(non-monotone) Frege proofs of $PHP^{n+1}_n$ given by Buss formalize a counting
argument, and it is not clear how to formalize counting arguments into short
monotone proofs. See the paper by Pudlak [23] for a further discussion on this
topic (see also [El).

In this work we exhibit quasipolynomial-size proofs of $PHP^{n+1}_n$ in the Monotone
Gentzen Calculus. To obtain this result, we consider quasipolynomial-size
monotone formulas to compute the boolean threshold functions. While
polynomial-size monotone formulas are known for these functions [33, 2], Pudlak
remarks that it is not clear whether their basic properties have short monotone
proofs. First, Valiant’s construction [33] is probabilistic, and therefore, it does
not provide any explicit formula to work with. Second, the sorting network of
Ajtai, Komlos, and Szemeredi [2] makes use of expander graphs, and there is
little hope that their basic properties will have short monotone proofs. Here we
address the difficulty raised by Pudlak by considering explicit quasipolynomial-size
monotone formulas $th^n_k(x_1, \ldots, x_n)$ to compute threshold functions. We
show that the basic properties of $th^n_k(x_1, \ldots, x_n)$ admit quasipolynomial-size
monotone proofs. In particular, we prove that for any permutation $\pi$ the sequent
$th^n_k(x_1, \ldots, x_n) \vdash th^n_k(x_{\pi(1)}, \ldots, x_{\pi(n)})$ has quasipolynomial-size monotone
proofs.

We remark that our proofs can be made tree-like, but details are omitted 
in this version. For non-monotone Gentzen Calculi, Krajicek m proved that 
tree-like proofs are as powerful as the unrestricted ones. But it is not known at 
present whether this holds for the monotone case, as the same technique does 
not apply. 

We also consider the formula $CLIQUE^n_k$ expressing the $(n, k)$-Clique-Coclique
Principle, used by Bonet, Pitassi and Raz, and for which an exponential lower
bound in Cutting Planes with polynomially bounded coefficients (poly-CP) was
proved |H| (notice the difference with the Clique Principle with common variables
introduced by Krajicek in [21], and used by Pudlak in [23] to obtain exponential
lower bounds for Cutting Planes with unrestricted coefficients. The latter is
not a monotone tautology of the form $A \to B$). We show that monotone proofs
for the monotone sequent obtained from the formula $CLIQUE^n_k$ can be reduced
to monotone proofs of the onto version of $PHP^k_{k-1}$, which in turn can be easily
reduced to the standard $PHP^k_{k-1}$. This way, we obtain quasipolynomial-size
monotone proofs of $CLIQUE^n_k$.

Our results imply that Resolution, Bounded-depth Frege, and poly-CP are 
exponentially separated from the (tree-like) Monotone Gentzen Calculus. Finally, 
as remarked in PI, a simple simulation argument shows that every proof in
the Monotone Gentzen Calculus is also a proof in the Intuitionistic Gentzen
Calculus. Hence, all our results also hold for this system.

2 Preliminaries 

A monotone formula is a propositional formula without negations. The Monotone
Gentzen Calculus (MLK), also called Geometric Logic [24], is obtained from
the standard Gentzen Calculus (LK [d I]) when only monotone formulas are
considered, and the negation rules are ignored. As usual, a proof in MLK is a
sequence of sequents, or lines, of the form $\Gamma \vdash \Delta$ each of which is either an initial
axiom, or has been obtained by a rule of MLK from two previous lines in the
sequence. The sequence constitutes a proof of the last sequent. When we restrict
the proofs in such a way that each derived sequent can be used only once as a
premise in a rule of the proof, we say that the system is tree-like.

The overall number of symbols used in a proof is the size of the proof. Let $A$
and $B_1, \ldots, B_n$ be formulas, and let $x_1, \ldots, x_n$ be propositional variables that
may or may not occur in $A$. We let $A(x_1/B_1, \ldots, x_n/B_n)$ denote the formula
that results from $A$ when all occurrences of $x_i$ (if any) are replaced by $B_i$
(replacements are made simultaneously). Observe that if $A$ and $B$ are monotone
formulas, then $A(x/B)$ is also monotone. The non-monotone version of the
following Lemma appears in 1711 01 (monotonicity is only needed in part (v), and
the proof is straightforward).

Lemma 1. For every monotone formula $A$, the sequents (i) $A, x \vdash A(x/1)$,
(ii) $A \vdash x, A(x/0)$, (iii) $A(x/1), x \vdash A$, (iv) $A(x/0) \vdash x, A$, and
(v) $A(x/0) \vdash A(x/1)$, have MLK-proofs of size quadratic in the size of $A$.

For every $n$ and $k \in \{0, \dots, n\}$, let $\mathrm{TH}^n_k : \{0,1\}^n \to \{0,1\}$ be the boolean
function such that $\mathrm{TH}^n_k(a_1, \dots, a_n) = 1$ if and only if $\sum_{i=1}^{n} a_i \ge k$, for every
$(a_1, \dots, a_n) \in \{0,1\}^n$. Each $\mathrm{TH}^n_k$ is called a threshold function. Valiant [33]
proved that every threshold function $\mathrm{TH}^n_k$ is computable by a monotone formula
of size polynomial in $n$. The proof being probabilistic, the construction is not
explicit. In the same paper, Valiant mentioned that a divide and conquer strategy
leads to explicit quasipolynomial-size monotone formulas for all threshold
functions. The same construction appears in the book by Wegener [35], and in
the more recent book by Vollmer [34]. Here we revisit that construction with a
minor modification. We define monotone formulas $\mathrm{th}^1_0(x) := 1$ and $\mathrm{th}^1_1(x) := x$,
and for every $n > 1$ and $k \in \{0, \dots, n\}$, define the formula

$$\mathrm{th}^n_k(x_1, \dots, x_n) := \bigvee_{(i,j) \in I^n_k} \bigl(\mathrm{th}^{n/2}_i(x_1, \dots, x_{n/2}) \wedge \mathrm{th}^{n-n/2}_j(x_{n/2+1}, \dots, x_n)\bigr),$$

where $I^n_k = \{(i,j) : 0 \le i \le n/2,\ 0 \le j \le n - n/2,\ i + j \ge k\}$ and $n/2$ is
an abbreviation for $\lfloor n/2 \rfloor$. It is straightforward to prove that $\mathrm{th}^n_k(x_1, \dots, x_n)$
computes the boolean function $\mathrm{TH}^n_k$. On the other hand, it is easy to prove, by
induction on $n$, that the size of $\mathrm{th}^n_k(x_1, \dots, x_n)$ is bounded by $n^{c \log n}$ for some
constant $c > 0$; that is, the size of $\mathrm{th}^n_k(x_1, \dots, x_n)$ is quasipolynomial in $n$.
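The recursive definition can be checked mechanically on small inputs. The following Python sketch is an illustration only and is not part of the paper: the function names `th` and `size` are assumptions, `th` evaluates the formula by following the recursion literally, and `size` counts leaves as a rough stand-in for formula size.

```python
from functools import lru_cache
from itertools import product


def th(k, xs):
    """Evaluate th^n_k on the 0/1 tuple xs by following the recursive definition."""
    n = len(xs)
    if n == 1:
        # Base cases: th^1_0(x) := 1 and th^1_1(x) := x.
        return 1 if k == 0 else xs[0]
    p = n // 2          # n/2 abbreviates floor(n/2)
    q = n - p
    # Disjunction over I^n_k = {(i, j) : 0 <= i <= p, 0 <= j <= q, i + j >= k}.
    return int(any(th(i, xs[:p]) and th(j, xs[p:])
                   for i in range(p + 1)
                   for j in range(q + 1)
                   if i + j >= k))


@lru_cache(maxsize=None)
def size(n, k):
    """Number of leaves (variables or constants) of th^n_k; a proxy for its size."""
    if n == 1:
        return 1
    p = n // 2
    q = n - p
    return sum(size(p, i) + size(q, j)
               for i in range(p + 1)
               for j in range(q + 1)
               if i + j >= k)


# Exhaustive check for small n that th^n_k computes the threshold function TH^n_k.
for n in range(1, 7):
    for k in range(n + 1):
        assert all(th(k, xs) == int(sum(xs) >= k)
                   for xs in product((0, 1), repeat=n))
```

Calling `size(n, n // 2)` for growing `n` gives a feel for the quasipolynomial growth, although the leaf count is only an approximation of the symbol count used in the text.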






3 Basic Properties of Threshold Formulas 

We establish a number of lemmas stating that the elementary properties of the
threshold formulas admit short MLK-proofs. Here, short means size polynomial
in the size of the formula $\mathrm{th}^n_k(x_1, \dots, x_n)$, and therefore, size quasipolynomial
in $n$. The first properties are easy:

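Lemma 2. For every $n, m, k \in \mathbb{N}$ with $m \le n/2$ and $k \le n - n/2$, and
for every $h, s \in \mathbb{N}$ with $n \ge h \ge s$, the sequents
(i) $\vdash \mathrm{th}^n_0(x_1, \dots, x_n)$,
(ii) $\mathrm{th}^n_n(x_1, \dots, x_n) \vdash \bigwedge_i x_i$,
(iii) $\mathrm{th}^{n/2}_m(x_1, \dots, x_{n/2}) \wedge \mathrm{th}^{n-n/2}_k(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^n_{m+k}(x_1, \dots, x_n)$, and
(iv) $\mathrm{th}^n_h(x_1, \dots, x_n) \vdash \mathrm{th}^n_s(x_1, \dots, x_n)$
have MLK-proofs of size quasipolynomial in $n$.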

In the next lemmas we give MLK-proofs of the basic properties relative to
the symmetry of the threshold formulas (Theorem 1 below).

Lemma 3. For every $n, m, k, l \in \mathbb{N}$, with $0 < m \le n$, $0 \le k < n$, and $0 < l \le n$,
the sequents

(i) $\mathrm{th}^n_{k+1}(x_1, \dots, x_l/1, \dots, x_n) \vdash \mathrm{th}^n_k(x_1, \dots, x_l/0, \dots, x_n)$

(ii) $\mathrm{th}^n_{m-1}(x_1, \dots, x_l/0, \dots, x_n) \vdash \mathrm{th}^n_m(x_1, \dots, x_l/1, \dots, x_n)$

have MLK-proofs of size quasipolynomial in $n$.

Proof: We first show (i). We use induction on $n$, where the base case is
$\mathrm{th}^1_1(1) \vdash \mathrm{th}^1_0(0)$. Assume without loss of generality that $l \le n/2$, that is,
$x_l$ is in the first half of the variables. Recall the definition of
$\mathrm{th}^n_{k+1}(x_1, \dots, x_l/1, \dots, x_n)$:

$$\bigvee_{(i,j) \in I^n_{k+1}} \bigl(\mathrm{th}^{n/2}_i(x_1, \dots, x_l/1, \dots, x_{n/2}) \wedge \mathrm{th}^{n-n/2}_j(x_{n/2+1}, \dots, x_n)\bigr).$$

Fix $(i,j) \in I^n_{k+1}$, let $p = n/2$, and let $q = n - n/2$. If $i = 0$, then $j \ge k + 1$
and $\mathrm{th}^q_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^q_k(x_{n/2+1}, \dots, x_n)$ by part (iv) of Lemma 2. Since
$\vdash \mathrm{th}^p_0(x_1, \dots, x_l/0, \dots, x_{n/2})$ by part (i) of Lemma 2, right $\wedge$-introduction gives
$\mathrm{th}^q_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^p_0(x_1, \dots, x_l/0, \dots, x_{n/2}) \wedge \mathrm{th}^q_k(x_{n/2+1}, \dots, x_n)$, and so
$\mathrm{th}^q_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^n_k(x_1, \dots, x_l/0, \dots, x_n)$ by a cut with part (iii) of Lemma 2.
Left weakening and left $\wedge$-introduction then give
$\mathrm{th}^p_i(x_1, \dots, x_l/1, \dots, x_{n/2}) \wedge \mathrm{th}^q_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^n_k(x_1, \dots, x_l/0, \dots, x_n)$
as desired. If $i > 0$, we have
$\mathrm{th}^p_i(x_1, \dots, x_l/1, \dots, x_{n/2}) \vdash \mathrm{th}^p_{i-1}(x_1, \dots, x_l/0, \dots, x_{n/2})$ by induction
hypothesis on $n$. Therefore, easy manipulation using part (iii) of Lemma 2 as before gives
$\mathrm{th}^p_i(x_1, \dots, x_l/1, \dots, x_{n/2}) \wedge \mathrm{th}^q_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^n_{i-1+j}(x_1, \dots, x_l/0, \dots, x_n)$.
Finally, since $i - 1 + j \ge k$, a cut with part (iv) of Lemma 2 gives the result.
The proof of (ii) is very similar. □

Lemma 4. For every $m, n, k, l \in \mathbb{N}$ with $1 \le k < l \le n$, and $m \le n$, the
sequents

(i) $\mathrm{th}^n_m(x_1, \dots, x_k/1, \dots, x_l/0, \dots, x_n) \vdash \mathrm{th}^n_m(x_1, \dots, x_k/0, \dots, x_l/1, \dots, x_n)$

(ii) $\mathrm{th}^n_m(x_1, \dots, x_k/0, \dots, x_l/1, \dots, x_n) \vdash \mathrm{th}^n_m(x_1, \dots, x_k/1, \dots, x_l/0, \dots, x_n)$

have MLK-proofs of size quasipolynomial in $n$.

Proof: Both proofs are identical. It is enough to prove (i) when $k \le n/2 < l$,
that is, when $x_k$ falls in the first half of the variables and $x_l$ falls in the
second half of the variables. The complete proof of (i) would then be a simple
induction on the recursive definition of $\mathrm{th}^n_m(x_1, \dots, x_k/1, \dots, x_l/0, \dots, x_n)$ whose
base case is when $k \le n/2 < l$. Notice that the base case is eventually reached,
at latest, when $n = 2$. So assume $k \le n/2 < l$ and recall the definition of
$\mathrm{th}^n_m(x_1, \dots, x_k/1, \dots, x_l/0, \dots, x_n)$:

$$\bigvee_{(i,j) \in I^n_m} \bigl(\mathrm{th}^{n/2}_i(x_1, \dots, x_k/1, \dots, x_{n/2}) \wedge \mathrm{th}^{n-n/2}_j(x_{n/2+1}, \dots, x_l/0, \dots, x_n)\bigr).$$

Fix $(i,j) \in I^n_m$, let $p = n/2$, and let $q = n - n/2$. If $i > 0$, then Lemma 3
shows that $\mathrm{th}^p_i(x_1, \dots, x_k/1, \dots, x_{n/2}) \vdash \mathrm{th}^p_{i-1}(x_1, \dots, x_k/0, \dots, x_{n/2})$. Similarly,
$\mathrm{th}^q_j(x_{n/2+1}, \dots, x_l/0, \dots, x_n) \vdash \mathrm{th}^q_{j+1}(x_{n/2+1}, \dots, x_l/1, \dots, x_n)$ whenever
$j < n - n/2$. From these two sequents, the result follows easily when $i > 0$
and $j < n - n/2$. Consider next the case in which either $i = 0$ or $j = n - n/2$.
If $j = n - n/2$, then $\mathrm{th}^q_{n-n/2}(x_{n/2+1}, \dots, x_l/0, \dots, x_n)$ is just provably
false by part (ii) of Lemma 2, and the result follows easily. If $i = 0$, then
$\mathrm{th}^p_0(x_1, \dots, x_k/0, \dots, x_{n/2})$ is just provably true by part (i) of Lemma 2. On the
other hand, $\mathrm{th}^q_j(x_{n/2+1}, \dots, x_l/0, \dots, x_n) \vdash \mathrm{th}^q_j(x_{n/2+1}, \dots, x_l/1, \dots, x_n)$ follows
by part (v) of Lemma 1, and the result follows too. □

Lemma 5. For every $m, n, i, j \in \mathbb{N}$, with $m \le n$ and $1 \le i < j \le n$, the sequent
$\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n) \vdash \mathrm{th}^n_m(x_1, \dots, x_j, \dots, x_i, \dots, x_n)$ has MLK-proofs
of size quasipolynomial in $n$.

Proof: We split the property according to the four possible truth values of $x_i$
and $x_j$. Namely, we will give proofs of the following four sequents from which
the lemma is immediately obtained by the cut rule.

(i) $\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n), x_i, x_j \vdash \mathrm{th}^n_m(x_1, \dots, x_j, \dots, x_i, \dots, x_n)$

(ii) $\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n), x_i \vdash x_j, \mathrm{th}^n_m(x_1, \dots, x_j, \dots, x_i, \dots, x_n)$

(iii) $\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n), x_j \vdash x_i, \mathrm{th}^n_m(x_1, \dots, x_j, \dots, x_i, \dots, x_n)$

(iv) $\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n) \vdash x_i, x_j, \mathrm{th}^n_m(x_1, \dots, x_j, \dots, x_i, \dots, x_n)$

We only show (ii), the rest are similar. Two applications of Lemma 1 give
$\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n), x_i \vdash x_j, \mathrm{th}^n_m(x_1, \dots, 1, \dots, 0, \dots, x_n)$. Lemma 4
gives $\mathrm{th}^n_m(x_1, \dots, x_i, \dots, x_j, \dots, x_n), x_i \vdash x_j, \mathrm{th}^n_m(x_1, \dots, 0, \dots, 1, \dots, x_n)$, and
two more applications of Lemma 1 again give
$\mathrm{th}^n_m(x_1, \dots, 0, \dots, 1, \dots, x_n), x_i \vdash x_j, \mathrm{th}^n_m(x_1, \dots, x_j, \dots, x_i, \dots, x_n)$.
Finally, a cut between the last two sequents gives (ii). The size of the proof is
quasipolynomial since we are applying Lemma 1 on $\mathrm{th}^n_m()$, whose size is
quasipolynomial in $n$. □

Since every permutation on $\{1, \dots, n\}$ can be obtained as the composition
of (polynomially many) permutations in which only two elements are permuted
(transpositions), Lemma 5 easily implies the following theorem.

Theorem 1. For every $m, n \in \mathbb{N}$, with $m \le n$, and for every permutation $\pi$
over $\{1, \dots, n\}$, the sequent $\mathrm{th}^n_m(x_1, \dots, x_n) \vdash \mathrm{th}^n_m(x_{\pi(1)}, \dots, x_{\pi(n)})$ has
MLK-proofs of size quasipolynomial in $n$.
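The step from Lemma 5 to Theorem 1 relies on writing a permutation as a short product of transpositions. A minimal Python sketch of one such decomposition follows; the function names and the 0-based indexing are assumptions made for illustration, and the selection-sort style strategy is one of several possible choices, not necessarily the one the authors have in mind.

```python
def transpositions(perm):
    """Decompose a permutation of 0..n-1 (perm[i] is the value placed at index i)
    into a list of swaps. Applying the swaps left to right to the identity
    arrangement yields perm. At most n - 1 swaps are produced."""
    n = len(perm)
    current = list(range(n))
    pos = {v: i for i, v in enumerate(current)}  # pos[v] = current index of value v
    swaps = []
    for i in range(n):
        if current[i] != perm[i]:
            j = pos[perm[i]]                     # where the wanted value sits now
            swaps.append((i, j))
            current[i], current[j] = current[j], current[i]
            pos[current[i]], pos[current[j]] = i, j
    return swaps


def apply_swaps(swaps, n):
    """Apply a list of swaps, in order, to the identity arrangement of size n."""
    arr = list(range(n))
    for i, j in swaps:
        arr[i], arr[j] = arr[j], arr[i]
    return arr
```

Each swap corresponds to one use of Lemma 5, so a permutation of $n$ variables needs at most $n - 1$ such uses chained by cuts.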

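The next two properties state that the smallest threshold formulas are provably
equivalent to their usual formulas. The proof is omitted.

Lemma 6. For every $n \in \mathbb{N}$, the sequents

(i) $\bigvee_{i} x_i \vdash \mathrm{th}^n_1(x_1, \dots, x_n)$; (ii) $\mathrm{th}^n_1(x_1, \dots, x_n) \vdash \bigvee_{i} x_i$;

(iii) $\bigvee_{i \ne j} (x_i \wedge x_j) \vdash \mathrm{th}^n_2(x_1, \dots, x_n)$; (iv) $\mathrm{th}^n_2(x_1, \dots, x_n) \vdash \bigvee_{i \ne j} (x_i \wedge x_j)$;

have MLK-proofs of size polynomial in $n$.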

The next lemma states that threshold functions split by cases: 

Lemma 7. For every $m, n \in \mathbb{N}$ with $m$ even, $m \le n$, and $n$ an exact power of
two, the sequents

(i) $\mathrm{th}^n_{m+1}(x_1, \dots, x_n) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2+1}(x_{n/2+1}, \dots, x_n)$;

(ii) $\mathrm{th}^n_m(x_1, \dots, x_n) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2}(x_{n/2+1}, \dots, x_n)$,

have MLK-proofs of size quasipolynomial in $n$.

Proof: We first prove (i). Fix $i, j \le n/2$ such that $i + j \ge m + 1$. Since $m$ is even,
either $i \ge m/2 + 1$ or $j \ge m/2 + 1$, for otherwise $i + j \le m$. In the former case
we get $\mathrm{th}^{n/2}_i(x_1, \dots, x_{n/2}) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2+1}(x_{n/2+1}, \dots, x_n)$ by
part (iv) of Lemma 2 and the rule of right weakening. In the latter case we get
$\mathrm{th}^{n/2}_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2+1}(x_{n/2+1}, \dots, x_n)$, and so
the rule of left $\wedge$-introduction puts these together in a single sequent. Since this
happens for every $i, j \le n/2$ such that $i + j \ge m + 1$, we get
$\mathrm{th}^n_{m+1}(x_1, \dots, x_n) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2+1}(x_{n/2+1}, \dots, x_n)$
as required. The proof of (ii) is extremely similar. Given $i, j \le n/2$ such that
$i + j \ge m$, either $i \ge m/2 + 1$ or $j \ge m/2$. In the former case, as before using
part (iv) of Lemma 2 we have
$\mathrm{th}^{n/2}_i(x_1, \dots, x_{n/2}) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2}(x_{n/2+1}, \dots, x_n)$. In the latter
case prove $\mathrm{th}^{n/2}_j(x_{n/2+1}, \dots, x_n) \vdash \mathrm{th}^{n/2}_{m/2+1}(x_1, \dots, x_{n/2}),\ \mathrm{th}^{n/2}_{m/2}(x_{n/2+1}, \dots, x_n)$
as before. Manipulation as in part (i) gives property (ii). □
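The counting facts behind Lemma 7 can be sanity-checked semantically. The Python sketch below is only an illustration under stated assumptions: it tests that the sequents are valid (true under every assignment), reading the comma on the right of a sequent as a disjunction; it says nothing about proof size, and the helper names are hypothetical.

```python
from itertools import product


def TH(k, xs):
    """Threshold function: at least k of the bits in xs are 1."""
    return sum(xs) >= k


def lemma7_valid(n):
    """Check both sequents of Lemma 7 for every even m <= n and every assignment.
    n is assumed to be an exact power of two, as in the statement of the lemma."""
    half = n // 2
    for m in range(0, n + 1, 2):
        for xs in product((0, 1), repeat=n):
            left, right = xs[:half], xs[half:]
            # (i): th^n_{m+1} entails th_{m/2+1}(left) or th_{m/2+1}(right)
            if TH(m + 1, xs) and not (TH(m // 2 + 1, left) or TH(m // 2 + 1, right)):
                return False
            # (ii): th^n_m entails th_{m/2+1}(left) or th_{m/2}(right)
            if TH(m, xs) and not (TH(m // 2 + 1, left) or TH(m // 2, right)):
                return False
    return True
```

Thresholds larger than the number of available variables are simply unsatisfiable here, which matches the intended reading of the formulas.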



4 Monotone Proofs of PHP 

The Pigeon Hole Principle states that if $n + 1$ pigeons go into $n$ holes, then
there is some hole with more than one pigeon sitting in it. It is encoded by the
following (non-monotone) formula

$$\mathrm{PHP}^{n+1}_n := \bigwedge_{i=1}^{n+1} \bigvee_{j=1}^{n} p_{i,j} \to \bigvee_{k=1}^{n} \bigvee_{\substack{i,j=1 \\ i \ne j}}^{n+1} (p_{i,k} \wedge p_{j,k}).$$

Observe that the Pigeon Hole Principle can be obtained as a monotone sequent
simply by replacing the symbol $\to$ above by the symbol $\vdash$. From now on we refer
to the left part of the sequent as $\mathrm{LPHP}_n$, and to the right part of the sequent
as $\mathrm{RPHP}_n$. The sequent itself is denoted $\mathrm{PHP}_n$.
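To make the encoding concrete, the following Python sketch checks by brute force that the sequent is valid for very small $n$. It is an illustration only: the function name and the 0-based indexing of pigeons and holes are assumptions, and exhaustive search over all $2^{n(n+1)}$ assignments is feasible only for tiny instances.

```python
from itertools import product


def php_sequent_is_valid(n):
    """Check that every assignment satisfying LPHP_n also satisfies RPHP_n.

    Variable p(i, j) says that pigeon i sits in hole j, with pigeons
    0..n and holes 0..n-1 (0-based, unlike the text)."""
    pigeons, holes = range(n + 1), range(n)
    for bits in product((0, 1), repeat=(n + 1) * n):
        def p(i, j):
            return bits[i * n + j]

        # LPHP_n: every pigeon sits in at least one hole.
        lphp = all(any(p(i, j) for j in holes) for i in pigeons)
        # RPHP_n: some hole contains two distinct pigeons.
        rphp = any(p(i, k) and p(j, k)
                   for k in holes for i in pigeons for j in pigeons if i != j)
        if lphp and not rphp:
            return False
    return True
```

Validity is of course the easy part; the content of the paper is that the sequent has monotone proofs of quasipolynomial size.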

We first see that $\mathrm{PHP}_n$ can be reduced to the case in which $n$ is an exact
power of two. The proof of this lemma is easy and omitted.

Lemma 8. There exists a polynomial $p(n)$ such that, for every $m, S \in \mathbb{N}$, if the
sequent $\mathrm{PHP}_m$ has a MLK-proof of size at most $S$, then, for every $n \le m$, the
sequent $\mathrm{PHP}_n$ has a MLK-proof of size at most $S + p(n)$.

Theorem 2. The sequents $\mathrm{PHP}_n$ have MLK-proofs of quasipolynomial size.

Proof: We first outline the idea of the proof. From the antecedent of $\mathrm{PHP}_n$ we
immediately derive that for each pigeon $i$ there is at least one variable $p_{i,j}$ that
is true ($\mathrm{th}^{n}_1(p_{i,1}, \dots, p_{i,n})$). We deduce that among all variables grouped by
pigeons, at least $n + 1$ are true ($\mathrm{th}^{n(n+1)}_{n+1}(p_{1,1}, \dots, p_{1,n}, \dots, p_{n+1,1}, \dots, p_{n+1,n})$).
The symmetry of the threshold allows us to show that the same holds when the
variables are grouped by holes ($\mathrm{th}^{n(n+1)}_{n+1}(p_{1,1}, \dots, p_{n+1,1}, \dots, p_{1,n}, \dots, p_{n+1,n})$).
From this, at least one hole contains two pigeons ($\mathrm{th}^{n+1}_2(p_{1,i}, \dots, p_{n+1,i})$ for some
$i \in \{1, \dots, n\}$), and this implies $\mathrm{RPHP}_n$.

According to Lemma 8 it is enough to give quasipolynomial-size proofs of
$\mathrm{PHP}_n$ when $n + 1$ is a power of two, since there is always a power of two between
$n$ and $2n$. So let us assume $n = 2^r - 1$ for some $r \in \mathbb{N}$. For technical reasons
in the proof we will consider a squared form (instead of rectangular form) of
$\mathrm{PHP}_n$ where we assume the existence of an $(n+1)$-st hole in which no pigeon
can go. So, we introduce $n + 1$ new symbols $p_{1,n+1}, \dots, p_{n+1,n+1}$ that will stand
for the constant 0. For every $i \in \{1, \dots, n + 1\}$, let $p_i = (p_{i,1}, \dots, p_{i,n+1})$, and
let $q_i = (p_{1,i}, \dots, p_{n+1,i})$ (hence $q_{n+1} = (0, \dots, 0)$ is the sequence of $n + 1$ zeros).
Consider the following four sequents.

$$\mathrm{LPHP}_n \vdash \bigwedge_{i=1}^{n+1} \mathrm{th}^{n+1}_1(p_i) \qquad (1)$$

$$\bigwedge_{i=1}^{n+1} \mathrm{th}^{n+1}_1(p_i) \vdash \mathrm{th}^{(n+1)^2}_{n+1}(p_1, \dots, p_{n+1}) \qquad (2)$$

$$\mathrm{th}^{(n+1)^2}_{n+1}(p_1, \dots, p_{n+1}) \vdash \mathrm{th}^{(n+1)^2}_{n+1}(q_1, \dots, q_{n+1}) \qquad (3)$$

$$\mathrm{th}^{(n+1)^2}_{n+1}(q_1, \dots, q_{n+1}) \vdash \mathrm{RPHP}_n \qquad (4)$$

In the next lemmas we show how to prove these sequents with quasipolynomial-size
MLK-proofs. A MLK-proof of $\mathrm{LPHP}_n \vdash \mathrm{RPHP}_n$ of size quasipolynomial in
$n$ will follow by three applications of the cut rule. □

Lemma 9. Sequent (1) has MLK-proofs of size polynomial in $n$.

Proof: For each $i \in \{1, \dots, n + 1\}$ derive the sequents
$\bigvee_{j=1}^{n} p_{i,j} \vdash \bigvee_{j=1}^{n} p_{i,j} \vee 0$
using right weakening and right $\vee$-introduction. Then, $n$ right $\wedge$-introductions
and $n$ left $\wedge$-introductions give $\mathrm{LPHP}_n \vdash \bigwedge_{i=1}^{n+1} \mathrm{th}^{n+1}_1(p_i)$ by the definition of
$\mathrm{LPHP}_n$ and a cut on part (i) of Lemma 6. The size of the whole proof is quadratic
in $n$. □

Lemma 10. Sequent (2) has MLK-proofs of size quasipolynomial in $n$.

Proof: Recall that $n + 1 = 2^r$. Let $N = (n + 1)^2$. The idea of this proof is
to successively pack the conjuncts of the antecedent into a unique threshold
formula, following a complete binary tree structure of height $\log_2(n + 1) = r$. For
every $w \in \{0,1\}^r$, let $p^w = p_{\bar{w}}$, where $\bar{w}$ is the position of $w$ in the lexicographical
order on $\{0,1\}^r$. Thus, $p^{0^r} = p_1$ and $p^{1^r} = p_{n+1}$. For every $w \in \{0,1\}^{<r}$, let
$p^w = (p^{w0}, p^{w1})$. Observe that $p^{\lambda} = (p_1, \dots, p_{n+1})$, where $\lambda$ is the empty word.
For each $t \in \{1, \dots, r\}$, we exhibit a MLK-proof of

$$\bigwedge_{w \in \{0,1\}^{t}} \mathrm{th}^{N/2^{t}}_{(n+1)/2^{t}}(p^w) \vdash \bigwedge_{w \in \{0,1\}^{t-1}} \mathrm{th}^{N/2^{t-1}}_{(n+1)/2^{t-1}}(p^w) \qquad (5)$$

of size quasipolynomial in $n$. Once we have all these proofs, we only have to cut
sequentially to obtain the lemma. We prove sequent (5). For a fixed $t \in \{1, \dots, r\}$
and a fixed $w \in \{0,1\}^{t-1}$, an application of part (iii) of Lemma 2 gives

$$\mathrm{th}^{N/2^{t}}_{(n+1)/2^{t}}(p^{w0}) \wedge \mathrm{th}^{N/2^{t}}_{(n+1)/2^{t}}(p^{w1}) \vdash \mathrm{th}^{N/2^{t-1}}_{(n+1)/2^{t-1}}(p^{w}).$$

We put all these formulas in a unique conjunction using $\wedge$-introduction to get
sequent (5). The size of the proof is clearly quasipolynomial in $n$. □



Lemma 11. Sequent (3) has MLK-proofs of size quasipolynomial in $n$.

Proof: Immediate from Theorem 1 because $q_1, \dots, q_{n+1}$ is a permutation of
$p_1, \dots, p_{n+1}$. □



Lemma 12. Sequent (4) has MLK-proofs of size quasipolynomial in $n$.

Proof: The idea of this proof is to unfold the threshold formula in the antecedent
into disjunctions of threshold formulas computing the number of pigeons going
into each hole. The unpacking process follows the structure of a complete binary
tree of height $\log_2(n + 1) = r$ in reverse order of that of Lemma 10. We use
properties (i) and (ii) of Lemma 7 to perform this process.

Recall that $n + 1 = 2^r$. Let $N = (n + 1)^2$. Define $q^w = q_{\bar{w}}$ for every $w \in \{0,1\}^r$,
where $\bar{w}$ is defined as in the proof of Lemma 10. For every $w \in \{0,1\}^{<r}$, define
$q^w = (q^{w0}, q^{w1})$. Observe that $q^{\lambda} = (q_1, \dots, q_{n+1})$. For every $t \in \{0, \dots, r - 1\}$
and $w \in \{0,1\}^t$, properties (ii) and (i) of Lemma 7 give

$$\mathrm{th}^{N/2^{t}}_{(n+1)/2^{t}}(q^{w}) \vdash \mathrm{th}^{N/2^{t+1}}_{(n+1)/2^{t+1}+1}(q^{w0}),\ \mathrm{th}^{N/2^{t+1}}_{(n+1)/2^{t+1}}(q^{w1})$$

$$\mathrm{th}^{N/2^{t}}_{(n+1)/2^{t}+1}(q^{w}) \vdash \mathrm{th}^{N/2^{t+1}}_{(n+1)/2^{t+1}+1}(q^{w0}),\ \mathrm{th}^{N/2^{t+1}}_{(n+1)/2^{t+1}+1}(q^{w1}).$$

Appropriate cuts and the definition of $q^w$ for $w \in \{0,1\}^r$ show then that
$\mathrm{th}^{N}_{n+1}(q^{\lambda}) \vdash \mathrm{th}^{n+1}_2(q_1), \mathrm{th}^{n+1}_2(q_2), \dots, \mathrm{th}^{n+1}_2(q_n), \mathrm{th}^{n+1}_1(q_{n+1})$.

Since $q_{n+1} = (0, \dots, 0)$, we immediately have that $\mathrm{th}^{n+1}_1(q_{n+1}) \vdash 0$ by part
(ii) of Lemma 6, so that the result follows by a cut on $0 \vdash$, successive cuts on
part (iv) of Lemma 6, and right $\vee$-introduction. The size of the proof is again
quasipolynomial in $n$. □
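The bookkeeping of thresholds along the unfolding tree is easy to get wrong, so a small Python sketch may help. It tracks only the threshold subscripts, not the formulas or the proofs, and the function name `unfold` is an assumption made for illustration.

```python
def unfold(threshold, depth):
    """Thresholds at the leaves after `depth` rounds of splitting with Lemma 7.

    An even threshold m is split by part (ii) into m/2 + 1 (left) and m/2 (right);
    an odd threshold m + 1 is split by part (i) into m/2 + 1 on both sides."""
    if depth == 0:
        return [threshold]
    if threshold % 2 == 0:
        m = threshold
        left, right = m // 2 + 1, m // 2
    else:
        m = threshold - 1
        left = right = m // 2 + 1
    return unfold(left, depth - 1) + unfold(right, depth - 1)


# With n + 1 = 8 holes (r = 3), the root threshold n + 1 unfolds into
# threshold 2 for the first n holes and threshold 1 for the dummy last hole.
print(unfold(8, 3))
```

The leaf pattern, threshold 2 everywhere except threshold 1 at the last position, is what lets the dummy hole $q_{n+1} = (0, \dots, 0)$ be cut away at the end of the proof.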
5 Separation Results

A graph $G$ is a $k$-clique if there is a set of $k$ nodes of $G$ such that any two distinct
nodes of the set are connected by an edge, and no other edge is present in $G$. A
graph $G$ is a $k$-coclique if there is a partition of the nodes of $G$ into $k$ disjoint
sets in such a way that any two nodes that belong to different sets are connected
by an edge, and no other edges are present in $G$.

The $(n, k)$-clique-coclique principle of [8] says that, given a set $V$ of $n$ nodes, if
$G$ is a $k$-clique over $V$ and $H$ is a $(k-1)$-coclique over $V$, then there is an edge in
$G$ that is not present in $H$. This principle may be stated as a monotone sequent
$\mathrm{CLIQUE}^n_k$ as follows. For every $l \in \{1, \dots, k\}$ and $i \in \{1, \dots, n\}$, let $x_{l,i}$ be a
propositional variable whose intended meaning is that $i$ is the $l$-th largest node
of the fully connected set which forms a fixed $k$-clique over $\{1, \dots, n\}$. Similarly,
for every $l \in \{1, \dots, k-1\}$ and $i \in \{1, \dots, n\}$, let $y_{l,i}$ be a propositional variable
whose intended meaning is that the $i$-th node is in the $l$-th disjoint set of a fixed
$(k-1)$-coclique over $\{1, \dots, n\}$. The principle is then expressed as follows

$$\bigwedge_{l=1}^{k} \bigvee_{i=1}^{n} x_{l,i} \wedge \bigwedge_{i=1}^{n} \bigvee_{l'=1}^{k-1} y_{l',i} \vdash \bigvee_{t=1}^{k-1} \bigvee_{\substack{l,l'=1 \\ l \ne l'}}^{k} \bigvee_{i,j=1}^{n} (x_{l,i} \wedge x_{l',j} \wedge y_{t,i} \wedge y_{t,j}) \vee \bigvee_{\substack{l,l'=1 \\ l \ne l'}}^{k} \bigvee_{i=1}^{n} (x_{l,i} \wedge x_{l',i}).$$

As in [8], the reduction of $\mathrm{CLIQUE}^n_k$ to $\mathrm{PHP}^k_{k-1}$ is accomplished by the
substitution of variable $p_{l,l'}$ in $\mathrm{PHP}^k_{k-1}$ by the monotone formula
$\bigvee_{i=1}^{n} (x_{l,i} \wedge y_{l',i})$. The details of the reduction are easy to work out in MLK,
and are left to the long version of the paper.

Corollary 1. The sequents $\mathrm{CLIQUE}^n_k$ have MLK-proofs of quasipolynomial
size.

Putting together our upper bounds for $\mathrm{PHP}^{n+1}_n$ and for $\mathrm{CLIQUE}^n_k$ with the
exponential lower bounds in Resolution [18], Bounded-Depth Frege [4], and
poly-CP [8], we obtain the following separation result:

Theorem 3. Resolution, Bounded-Depth Frege and poly-CP are exponentially 
separated from the Monotone Gentzen Calculus. 

The Intuitionistic Gentzen Calculus forbids sequents with more than one for- 
mula in their consequent (see m for a precise definition) . As observed by Pudlak 
m, there is a simple simulation of the Monotone Gentzen Calculus by the In- 
tuitionistic Gentzen Calculus. The simulation consists in replacing consequents 
with more than one formula by the disjunction of these formulas. This simple 
simulation implies that all our results also hold for the Intuitionistic Gentzen 
Calculus. 

In [24], Pudlák proves that the Intuitionistic Gentzen Calculus enjoys a feasible
interpolation property. It is also asked in [24] whether the feasible interpolation
can be made monotone. While we have been able to provide a quasipolynomial
upper bound for the size of intuitionistic proofs of an encoding of the
Clique-Coclique Principle, it is not clear whether the encoding of the Clique
Principle on which to apply the interpolation property (the one with common
variables as in EP) enjoys the same upper bound. The reason is that the result- 
ing sequent is not monotone anymore, and our reduction method does not apply. 
On the other hand, a positive answer would imply that the disjointness property 
for the Intuitionistic Gentzen Calculus would belong to P/poly − mP/poly. In
fact, the disjointness property would be computable by a (uniform) polynomial- 
size circuit (see HH for a proof of this fact), but would not be computable by 
a monotone polynomial-size circuit, since otherwise, the Intuitionistic Gentzen
Calculus would admit the monotone feasible interpolation property. 

Acknowledgments. We thank Maria L. Bonet for helpful comments and insights, and
Pavel Pudlák for comments on a preliminary version and for pointing out that our proofs
also hold for the tree-like case. Toni Pitassi has informed us that she obtained Theorem
El independently. We thank her for reading a preliminary version of this paper. We also 
thank the anonymous ICALP referees for useful comments. 



References 

1. M. Ajtai. The complexity of the pigeonhole principle. Combinatorica, 14, pp. 417- 
433, 1994. 

2. M. Ajtai, J. Komlos, E. Szemeredi. An O(n log n) sorting network. Combinatorica,
3(1), pp. 1-19, 1983. 

3. N. Alon, R. B. Boppana. The monotone circuit complexity of boolean functions. 
Combinatorica, 7, pp. 1-22, 1987. 

4. P. Beame, R. Impagliazzo, J. Krajicek, T. Pitassi, P. Pudlak, A. Woods. Exponen- 
tial lower bounds for the Pigeon Hole Principle. STOC’92, pp. 200-220, 1992. 

5. P. Beame, T. Pitassi. Propositional Proof Complexity: Past, Present and Future. 
Bulletin of the European Association for Theoretical Computer Science, 65, 1998. 

6. P. Beame, T. Pitassi. Simplified and Improved Resolution Lower Bound. FOCS’96,
pp. 274-282, 1996.

7. M. Bonet, C. Domingo, R. Gavalda, A. Maciel, T. Pitassi. Non-automatizability
of Bounded-Depth Frege Proofs. IEEE Conference on Computational Complexity,
1998.

8. M. Bonet, T. Pitassi, R. Raz. Lower Bounds for Cutting Planes Proofs with small 
Coefficients. Journal of Symbolic Logic, 62 (3), pp. 708-728, 1997. A preliminary 
version appeared in STOC’95. 

9. S. R. Buss. Polynomial size proofs of the propositional pigeon hole principle. Jour- 
nal of Symbolic Logic, 52 (4), pp. 916-927, 1987. 

10. S. Buss. Some remarks on length of proofs. Archive for Mathematical Logic, 34,
pp. 377-394, 1995.

11. S. Buss, G. Mints. The complexity of disjunction and existence properties in
intuitionistic logic. Preprint, 1998.

12. S. Buss, T. Pitassi. Resolution and the weak Pigeonhole principle. Invited Talk to
CSL 97. To appear in Selected Papers of the 11-th CSL, Lecture Notes in Computer
Science, 1998.

13. S. R. Buss, G. Turan. Resolution proofs of generalized pigeonhole principles.
Theoretical Computer Science, 62 (3), pp. 311-317, 1988.






14. V. Chvátal, E. Szemerédi. Many hard examples for resolution. Journal of the
Association for Computing Machinery, 35, pp. 759-768, 1988.

15. P. Clote, A. Setzer. On PHP, st-connectivity and odd charged graphs. Proof
Complexity and Feasible Arithmetics, 93-118, DIMACS Series in Discrete Mathematics
and Theoretical Computer Science, Vol 39, eds. Paul W. Beame and Samuel R.
Buss, 1998.

16. S. Cook, R. Reckhow. The relative efficiency of propositional proof systems. Journal 
of Symbolic Logic, 44, pp. 36-50, 1979. 

17. W. Cook, C. R. Coullard, G. Turan. On the complexity of Cutting Plane proofs. 
Discrete Applied Mathematics, 18, pp. 25-38, 1987. 

18. A. Haken. The intractability of resolution. Theoretical Computer Science, 39 (2-3), 
pp. 297-305, 1985. 

19. R. Impagliazzo, P. Pudlak, J. Sgall. Lower Bounds for the Polynomial Calculus 
and the Groebner basis Algorithm. ECCC TR97-042. To appear in Computational 
Complexity. 

20. J. Krajicek. Speed-up for propositional Frege systems via generalizations of proofs, 
Commentationes Mathematicae Universitatis Carolinae, 30, 1989, pp. 137-140.

21. J. Krajicek. Interpolation theorems, lower bounds for proof systems and indepen- 
dence results for bounded arithmetic. Journal of Symbolic Logic, 62, pp. 457-486, 
1997. 

22. A. Maciel, T. Pitassi, and A. R. Woods. A New Proof of the Weak Pigeonhole 
Principle. To appear in STOC’00.

23. J. B. Paris, A. J. Wilkie, and A. R. Woods. Provability of the pigeonhole principle 
and the existence of infinitely many primes. Journal of Symbolic Logic, 53 (4), pp. 
1235-1244, 1988. 

24. P. Pudlak. On the complexity of the propositional Calculus. Logic Colloquium ’97. 
To appear. 

25. P. Pudlak. Lower bounds for resolutions and cutting planes proofs and monotone 
computations. Journal of Symbolic Logic, 62 (2), pp. , 1997. 

26. P. Pudlak, S. Buss. How to lie without being (easily) convicted and the lengths 
of proofs in propositional calculus. 8th Workshop on CSL, Kazimierz, Poland, 
September 1994, Springer Verlag LNCS n.995, pp. 151-162, 1995. 

27. A. Razborov. Lower bounds for the monotone complexity of some boolean func- 
tions. Soviet Math. Doklady, 31 (2), pp. 354-357, 1985. 

28. A. Razborov. Lower bounds for the Polynomial Calculus. Computational Complex- 
ity, 7 (4), pp. 291-324, 1998. 

29. A. A. Razborov, A. Wigderson, A. Yao. Read Once Branching Programs, Rectan- 
gular Proofs of the Pigeonhole Principle and the Transversal Calculus. STOC’97, 
pp. 739-748, 4-6 May 1997. 

30. S. Riis. A Complexity Gap for Tree-Resolution. BRICS Report Series, RS-99-29, 
1999. 

31. G. Takeuti. Proof Theory. North-Holland, second edition, 1987. 

32. A. Urquhart. Hard examples for Resolution. Journal of the Association for Com- 
puting Machinery, 34 (1), pp. 209-219, 1987. 

33. L. Valiant. Short monotone formulae for the majority function. Journal of Algo- 
rithms, 5, pp. 363-366, 1984. 

34. H. Vollmer. Introduction to Circuit Complexity. Springer, 1999. 

35. I. Wegener. The Complexity of Boolean Functions. J. Wiley and Sons, 1987. 




Fully-Abstract Statecharts Semantics
via Intuitionistic Kripke Models

Gerald Lüttgen¹ and Michael Mendler²

¹ ICASE, Mail Stop 132C, NASA Langley Research Center,
Hampton, Virginia 23681-2199, USA, luettgen@icase.edu
² Department of Computer Science, Sheffield University, 211 Portobello Street,
Sheffield S1 5DP, U.K., M.Mendler@dcs.shef.ac.uk



Abstract. The semantics of Statecharts macro steps, as introduced by 
Pnueli and Shalev, lacks compositionality. This paper first analyzes the 
compositionality problem and traces it back to the invalidity of the Law 
of the Excluded Middle. It then characterizes the semantics via a par- 
ticular class of linear, intuitionistic Kripke models, namely stabilization 
sequences. This yields, for the first time in the literature, a simple fully- 
abstract semantics which interprets Pnueli and Shalev’s concept of failure 
naturally. The results not only give insights into the semantic subtleties 
of Statecharts, but also provide a basis for developing algebraic theories 
for macro steps and for comparing different Statecharts variants. 

1 Introduction 

Statecharts is a well-known design notation for specifying the behavior of em- 
bedded systems 0. It extends finite state machines by concepts of hierarchy 
and concurrency. Semantically, a Statechart may respond to an event entering 
the system by engaging in an enabled transition. This may generate new events 
which, by causality, may in turn trigger additional transitions while disabling
others. The synchrony hypothesis ensures that one execution step, a so-called 
macro step, is complete as soon as this chain reaction comes to a halt. 

Pnueli and Shalev presented two equivalent formalizations of Statecharts’ 
macro-step semantics in a seminal paper HE|. However, their semantics violates 
the desired property of compositionality. Huizing and Gerth mu showed that 
combining compositionality, causality, and the synchrony hypothesis cannot be 
done within a simple, single-leveled semantics. Some researchers then devoted 
their attention to investigating new variants of Statecharts, obeying just two of 
the three properties. In Esterel |0] and Argos HSI causality is treated separately 
from compositionality and synchrony, while in (synchronous) Statemate 0 and 
UML Statecharts |Z] the synchrony hypothesis is rejected. Other researchers 
achieved combining all three properties by storing semantic information via pre- 
orders [1 411 7^ or transition systems [bl 1 ,3) . However, no analysis of exactly how 
much information is needed to achieve compositionality has been made yet.

This paper first illustrates the compositionality defect of Pnueli and Shalev’s
semantics by showing that equality of response behavior is not preserved by
the concurrency and hierarchy operators of Statecharts (cf. Sec. 2). The reason
is that macro steps abstract from causal interactions with a system’s environment,
thereby imposing a closed-world assumption. Indeed, the studied problem
can be further traced back to the invalidity of the Law of the Excluded Middle.
To overcome the problem, we interpret Statecharts, relative to a given system 
state, as intuitionistic formulas. These are given meaning as specific intuitionistic 
Kripke structures m, namely linear increasing sequences of event sets, called 
stabilization sequences, which encode interactions between Statecharts and
environments. In this domain, which is also characterized algebraically via
semi-lattices, we develop a fully-abstract macro-step semantics in two steps. First, we
study Statecharts without hierarchy operators. We show that in this fragment, 
stabilization sequences naturally characterize the largest congruence contained 
in equality of response behavior (cf. Sec. EJ. In the second step, based on a 
non-standard distributivity law and our lattice-theoretic characterization of the 
intuitionistic semantics, we lift our results to arbitrary Statecharts (cf. Sec. EJ. 
We refer the reader to m for the proofs of our results. 

2 Statecharts: Notation, Semantics, & Compositionality

Statecharts is a visual language for specifying reactive systems, i.e., concurrent 
systems interacting with their environment. They subsume labeled transition 
systems where labels are pairs of event sets. The first component of a pair is 
referred to as trigger, which may include negated events, and the second as 
action. Intuitively, a transition is enabled if the environment offers all events in 
the trigger but not the negated ones. When a transition fires, it produces the 
events specified in its action. Concurrency is introduced by allowing Statecharts 
to run in parallel and to communicate by broadcasting events. Additionally, basic 
states may be hierarchically refined by injecting other Statecharts. 




Fig. 1. Two example Statecharts 



As an example, the Statechart depicted in Fig. 1 on the left consists of an
and-state $s_{16}$, which puts and-state $s_{14}$ and or-state $s_{56}$ in parallel. Similarly,
state $s_{14}$ is a parallel composition of or-states $s_{12}$ and $s_{34}$. Each of these or-states
describes a sequential state machine and is refined by two basic states. In case
of $s_{12}$, basic state $s_1$ is the initial state which is connected to basic state $s_2$ via
transition $t_1$. Here, $s_1$ is the source state of $t_1$, state $s_2$ is its target state,
its trigger is empty, and $a$ is its action. Hence, $t_1$ is always enabled in
the initial state, regardless of the events offered by the environment. Its firing
produces event $a$ and switches the active state of $s_{12}$ to $s_2$. This initiates a causal
chain reaction, since the generation of $a$ in turn triggers $t_3$ which introduces
event $b$. As a consequence, $t_2$ is enabled and fires within the same macro step.

The Statechart depicted in Fig. 1 on the right is like the one on the left, except
that and-state $s_{14}$ is replaced by or-state $s_{79}$. The latter state encodes a choice
regarding the execution of $t_4$ and $t_5$ from state $s_7$. The trigger of $t_4$ is $\bar{b}$, i.e., $t_4$
is triggered by the absence of event $b$. Starting with an environment offering no
event, thus assuming $b$ to be absent, $s_{59}$ can autonomously engage in $t_4$. The
generation of $a$ in turn triggers $t_3$, which fires and produces $b$. However, $t_4$ was
fired under the assumption that $b$ is absent. Since Statecharts is a synchronous
language and no event can be both present and absent within a macro step,
this behavior is rejected as globally inconsistent. Thus, the response of $s_{59}$ to the
empty environment is not an empty response but failure.
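The two scenarios can be replayed with a deliberately naive Python sketch. It is not the step-construction procedure of Pnueli and Shalev: it fires enabled transitions greedily and only afterwards checks global consistency, it ignores the choice structure of or-states, and the transition name `t_not_b` as well as the action `c` of `t2` are hypothetical placeholders that do not come from the text.

```python
def naive_step(transitions, env=frozenset()):
    """Fire transitions causally, then check global consistency.

    `transitions` maps a name to (positive_trigger, negative_trigger, action),
    each a set of event names. Returns the set of fired transition names, or
    None if a fired transition relied on the absence of an event that was
    generated within the same step (failure)."""
    events, fired = set(env), set()
    changed = True
    while changed:
        changed = False
        for name, (pos, neg, act) in sorted(transitions.items()):
            if name not in fired and pos <= events and not (neg & events):
                fired.add(name)
                events |= act
                changed = True
    consistent = all(not (transitions[t][1] & events) for t in fired)
    return fired if consistent else None


# Left Statechart: t1 has an empty trigger and produces a, t3 reacts to a and
# produces b, t2 reacts to b (its action "c" is a made-up placeholder).
left = {
    "t1": (set(), set(), {"a"}),
    "t2": ({"b"}, set(), {"c"}),
    "t3": ({"a"}, set(), {"b"}),
}

# Right Statechart, restricted to the transition triggered by the absence of b.
right = {
    "t_not_b": (set(), {"b"}, {"a"}),
    "t3": ({"a"}, set(), {"b"}),
}

print(naive_step(left))    # all three transitions fire in one macro step
print(naive_step(right))   # None: b was assumed absent but is produced
```

The second call reproduces the inconsistency described above: the transition guarded by the absence of $b$ fires first, and the chain reaction then produces $b$ within the same step.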

Statecharts Configurations and Step Semantics. We formalize the Statecharts
language relative to a given set of active states. Let $\Pi$ and $T$ be countable
sets of events and transition names, respectively. For every event $e \in \Pi$,
its negated counterpart is denoted by $\bar{e}$. We define $\bar{\bar{e}} =_{df} e$ and write $\bar{E}$ for
$\{\bar{e} \mid e \in E\}$. With every $t \in T$, we associate a transition $\mathrm{trg}(t)/\mathrm{act}(t)$ consisting of
a trigger $\mathrm{trg}(t) \subseteq \Pi \cup \bar{\Pi}$ and an action $\mathrm{act}(t) \subseteq \Pi$, where $\mathrm{trg}(t)$ and $\mathrm{act}(t)$ are
required to be finite sets. For simplicity we also write $e_1 \cdots e_n / a_1 \cdots a_m$ for
transition $\{e_1, \dots, e_n\}/\{a_1, \dots, a_m\}$. The syntax of Statecharts terms is the BNF
$C ::= 0 \mid x \mid t \mid C \| C \mid C + C$, where $t \in T$ and $x$ is a variable. Terms not
containing variables are called configurations. Intuitively, the configuration $0$ represents
a Statechart state with no outgoing transitions (basic state), $C \| D$ denotes the
parallel composition of configurations $C$ and $D$ (and-state), and $C + D$ stands for
the choice between executing $C$ or $D$ (or-state). The latter construct $+$ coincides
with Statecharts’ hierarchy operator which reduces to choice on the macro-step
level; thus, we refer to operator $+$ also as choice operator. In the standard visual
Statecharts notation, $C + D$ is somewhat more restrictive in that it requires $D$ to
be a choice of transitions; e.g., $(t_1 \| t_2) + (t_3 \| t_4)$ is prohibited according to
Statecharts’ syntax, whereas it is a valid configuration in our setting. Semantically,
however, our generalization is inessential wrt. the semantics of Pnueli and Shalev
which underlies this work (cf. ^ 2 ])- The set of all configurations is denoted by C 
and ranged over by C and D. The set of “-|-”-free, or parallel, configurations is 
written as PC. We call terms <l>[x\ with a single variable occurrence x contexts 
and write ^ 7 [C] for the substitution of C for x in <P[x]. Contexts of the form x\\C 
and X + C are referred to as parallel contexts and choice contexts, respectively. 
We tacitly assume that transition names are unique in every term, and we let 
trans(C') stand for the set of transition names occurring in C. 

Any Statechart in a given set of active states corresponds to a configuration.
For example, Statecharts $s_{14}$ and $s_{79}$, in their initial states (indicated by small
arrows in Fig. 1), correspond to $C_{14} =_{\mathrm{df}} t_1 \| t_2$ and $C_{79} =_{\mathrm{df}} t_4 + t_5$, respectively.
The Statecharts depicted in Fig. 1 are then formalized as $C_{16} =_{\mathrm{df}} \Phi_{56}[C_{14}]$
and $C_{59} =_{\mathrm{df}} \Phi_{56}[C_{79}]$, respectively, where $\Phi_{56}[x] =_{\mathrm{df}} x \| t_3$. Moreover, since
transitions are uniquely named in configurations and thus may be associated with
their source and target states, one can easily determine the set of active states
reached after firing a set of transitions; see [?] for details. In this paper, we
do not consider interlevel transitions and state references which would require
an extension of our syntax for configurations. However, our semantics should be
able to accommodate these features, too.

To present the response behavior of a configuration $C$, as defined by Pnueli
and Shalev [?], we have to determine which transitions in $\mathit{trans}(C)$ may fire
together to form a macro step. A macro step comprises a maximal set of transitions
that are triggered by events offered by the environment or produced by the firing
of other transitions, that are mutually consistent (“orthogonal”), and that obey
causality and global consistency. A transition $t$ is consistent with $T \subseteq \mathit{trans}(C)$, in
signs $t \in \mathit{consistent}(C, T)$, if $t$ is not in the same parallel component as any $t' \in T$.
A transition $t$ is triggered by a finite set $E$ of events, in signs $t \in \mathit{triggered}(C, E)$,
if the positive, but not the negative, trigger events of $t$ are in $E$. Finally, we say
that $t$ is enabled in $C$ regarding a finite set $E$ of events and a set $T$ of transitions,
if $t \in \mathit{enabled}(C, E, T) =_{\mathrm{df}} \mathit{consistent}(C, T) \cap \mathit{triggered}(C, E \cup \bigcup_{t \in T} \mathit{act}(t))$.
Intuitively, assuming transitions $T$ are known to fire, $\mathit{enabled}(C, E, T)$ determines
the set of all transitions of $C$ that are enabled by the actions of $T$ and the
environment events in $E$. We may now present Pnueli and Shalev’s step-construction
procedure for causally determining macro steps:

procedure step-construction($C$, $E$); var $T := \emptyset$;
  while $T \subset \mathit{enabled}(C, E, T)$ do choose $t \in \mathit{enabled}(C, E, T) \setminus T$; $T := T \cup \{t\}$ od;
  if $T = \mathit{enabled}(C, E, T)$ then (return $T$) else (report failure)

This procedure nondeterministically computes, relative to configuration $C$ and
finite environment $E$, those sets $T$ of transitions that can fire together in a macro
step. Due to failures raised when detecting global inconsistencies, the construction
might involve backtracking. The role of failures may be highlighted further
by a conservative extension of Pnueli and Shalev’s setting that includes an
explicit failure event $\bot \in \Pi$. It will be instructive to study the semantics with and
without $\bot$ in this paper. Now, for each set $T$ returned by the above procedure,
we say that $A =_{\mathrm{df}} E \cup \bigcup_{t \in T} \mathit{act}(t) \subseteq \Pi$ is a (step) response, in signs $C \Downarrow_E A$.
When $\bot$ is considered, we also require that $\bot \notin A$. If $E = \emptyset$, we simply write
$C \Downarrow A$. Note that $E$ may be modeled by a parallel context consisting of the
single transition $\cdot/E$, i.e., $C \Downarrow_E A$ iff $(C \| \cdot/E) \Downarrow A$. This macro-step semantics
induces a natural equivalence relation $\sim$ over configurations, called step equivalence,
satisfying $C \sim D$, whenever $C \Downarrow_E A$ iff $D \Downarrow_E A$, for all $E, A \subseteq_{\mathrm{fin}} \Pi$. For
simplicity, $\sim$ does not account for target states of transitions since these can be
encoded as event names.

The Compositionality Problem. The compositionality defect of the macro-step
semantics manifests itself in the fact that $\sim$ is not a congruence for the
configuration algebra. Consider Fig. 1 and assume that states $s_2$, $s_4$, $s_6$, $s_8$,
and $s_9$ are all equivalent. It is easy to see that configurations $C_{14}$ and $C_{79}$ have
the same response behavior. Both $C_{14} \Downarrow_E A$ and $C_{79} \Downarrow_E A$ are equivalent to
$A = E \cup \{a\}$, no matter whether event $b$ is present or absent in environment $E$.
However, $\Phi_{56}[C_{14}] = C_{16} \not\sim C_{59} = \Phi_{56}[C_{79}]$ since $C_{16} \Downarrow \{a, b\}$ but $C_{59} \not\Downarrow A$, for
any $A$. Hence, the equivalence $C_{14} \sim C_{79}$ is not preserved by context $\Phi_{56}[x]$.
Intuitively, $C_{14}$ and $C_{79}$ are identified because the response semantics does not
account for any interaction with the environment. It adopts the classic closed-world
assumption, stating that every environment event is either present from
the very beginning of a given macro step or will never arise. This eliminates the
possibility that events may be generated due to interactions with the environment,
in this case event $b$ in $C_{16} \Downarrow \{a, b\}$. In short, a compositional macro-step
semantics does not validate the Law of the Excluded Middle $b \vee \neg b = \mathit{true}$.
Since intuitionistic logic [?] differs from classic logic by refuting the Law of the
Excluded Middle, it is a good candidate framework for analyzing Statecharts
semantics. It should be stressed that the compositionality defect is mainly an
issue of operator $\|$ and not of $+$, as we will see below.
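
The step-construction procedure and the compositionality defect can be replayed on the running example with a small script. The following is only a sketch under stated assumptions: the names `Transition`, `tr`, `trans`, `orthogonal`, `enabled` and `responses` are hypothetical, configurations are encoded as nested tuples tagged `'par'` or `'sum'`, events are single characters, and the failure event is not modeled. The transition labels $t_1 = \cdot/a$, $t_2 = b/a$, $t_3 = a/b$, $t_4 = \bar b/a$, $t_5 = b/a$ are read off from the surrounding discussion of Fig. 1 and should be treated as an assumption, as should the reading of consistency as "two distinct transitions may fire together iff they meet at a parallel node".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    pos: frozenset  # events that must be present
    neg: frozenset  # events that must be absent
    act: frozenset  # events generated when the transition fires


def tr(name, pos="", neg="", act=""):
    """Build a transition; every character is one event (encoding assumption)."""
    return Transition(name, frozenset(pos), frozenset(neg), frozenset(act))


def trans(c):
    """Transitions of a configuration: a Transition, ('par', ...) or ('sum', ...)."""
    if isinstance(c, Transition):
        return {c}
    return set().union(*(trans(k) for k in c[1:]))


def orthogonal(c, t, u):
    """True if t and u are equal or their least common ancestor is a 'par' node."""
    if t == u:
        return True
    kind, *kids = c
    kt = next(k for k in kids if t in trans(k))
    ku = next(k for k in kids if u in trans(k))
    return orthogonal(kt, t, u) if kt is ku else kind == "par"


def enabled(c, env, fired):
    """consistent(C, T) intersected with triggered(C, E + actions of T)."""
    events = frozenset(env).union(*(t.act for t in fired))
    return frozenset(
        t for t in trans(c)
        if all(orthogonal(c, t, u) for u in fired)
        and t.pos <= events and not (t.neg & events)
    )


def responses(c, env=""):
    """All step responses of c under env, exploring every nondeterministic choice."""
    env, found, seen = frozenset(env), set(), set()

    def explore(fired):
        if fired in seen:
            return
        seen.add(fired)
        en = enabled(c, env, fired)
        if fired < en:            # loop condition: T is a proper subset of enabled
            for t in en - fired:
                explore(fired | {t})
        elif fired == en:         # stable set: return T
            found.add(env.union(*(t.act for t in fired)))
        # otherwise this branch reports failure (global inconsistency)

    explore(frozenset())
    return found


t1, t2, t3 = tr("t1", act="a"), tr("t2", pos="b", act="a"), tr("t3", pos="a", act="b")
t4, t5 = tr("t4", neg="b", act="a"), tr("t5", pos="b", act="a")
C14, C79 = ("par", t1, t2), ("sum", t4, t5)
C16, C59 = ("par", C14, t3), ("par", C79, t3)

for env in ("", "b"):
    print(repr(env), responses(C14, env) == responses(C79, env))
print(responses(C16))  # a single response containing a and b
print(responses(C59))  # empty: every branch of the construction fails
```

In this encoding $C_{14}$ and $C_{79}$ yield the same responses for both environments tried, while placing them in the context $x \| t_3$ separates them, which is the defect described above. The exhaustive search stands in for backtracking and is only practical for small configurations.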

Our goal is to characterize the largest congruence $\simeq$, called step congruence,
contained in step equivalence, where $C \simeq D$, if $\Phi[C] \sim \Phi[D]$ for all contexts $\Phi[x]$.
Of course, $C \simeq D$ iff $[\![C]\!]_0 = [\![D]\!]_0$, for $[\![C]\!]_0 =_{\mathrm{df}} \{\langle A, \Phi[x]\rangle \mid \Phi[C] \Downarrow A\}$. However,
$[\![\cdot]\!]_0$ is a syntactical characterization rather than a semantical characterization
which we will develop below. Note that we intend to achieve compositionality in
the (declarative) sense of a fully-abstract semantics and not in the (constructive)
sense of a denotational semantics.

3 Macro-step Semantics via Stabilization Sequences 

We start off by investigating parallel configurations within parallel contexts. 
We propose a novel semantics for this fragment, show its relation to Pnueli 
and Shalev’s original semantics, and derive a full-abstraction result. Section 4
generalizes this result to arbitrary configurations within arbitrary contexts.

Our new interpretation of parallel configurations $C$, based on an “open-world
assumption,” is given in terms of finite increasing sequences of “worlds”
$E_0 \subset E_1 \subset \cdots \subset E_n$. Each $E_i \subseteq \Pi \setminus \{\bot\}$ is the set of events generated or
present in the respective world. The required absence of $\bot$ ensures that each
world is consistent. A sequence represents the interactions between $C$ and a
potential environment during a macro step. Intuitively, the initial world $E_0$
contains all events $e$ which are generated by those transitions of $C$ that can fire
autonomously. When transitioning from world $E_{i-1}$ to $E_i$, some events in $E_i \setminus E_{i-1}$
are provided by the environment, as reaction to the events validated by $C$ when
reaching $E_{i-1}$. The new events destabilize world $E_{i-1}$ and may enable a chain
reaction of transitions in $C$. The step-construction procedure, which tracks and
accumulates all these events, then defines the new world $E_i$. Accordingly, we
call the above sequences stabilization sequences. The overall response of $C$ after
$n$ interactions with the environment is the set $E_n$.

The monotonicity requirement of stabilization sequences reflects the fact that
our knowledge of the presence and absence of events increases within the
construction of a macro step. Each world contains the events assumed or positively
known to be present. Only if an event is not included in the final world is it
known to be absent for sure; the fact that an event $e$ is not present in a world
does not preclude $e$ from becoming available later in the considered stabilization
sequence. This semantic gap between “not present” and “absent” makes the
underlying logic intuitionistic as opposed to classic.

Model-Theoretic Semantics for Parallel Configurations. Formally, a stabilization
sequence $M$ is a pair $(n, V)$, where $n \in \mathbb{N} \setminus \{0\}$ is its length and $V$ is
a state valuation, i.e., a monotonic mapping from the interval $[0, \ldots, n-1]$ to
finite subsets of $\Pi \setminus \{\bot\}$. The final world $V(n-1)$ of $M$ is denoted by $M^*$. We
shall assume that $M$ is irredundant, i.e., $V(i-1) \subset V(i)$ for all $0 < i < n$, and
identify sequences $(1, V)$ of length 1 with subsets $V(0) \subseteq \Pi \setminus \{\bot\}$.

Definition 1. Let $M = (n, V)$ be a stabilization sequence and $C \in \mathsf{PC}$. Then,
$M$ is a sequence model of $C$, written $M \models C$, according to the following clauses:
(i) always $M \models 0$; (ii) $M \models C \| D$ iff $M \models C$ and $M \models D$; (iii) $M \models E/A$ iff
($\bar E \cap \Pi \cap V(n-1) = \emptyset$ and $E \cap \Pi \subseteq V(i)$) implies $A \subseteq V(i)$, for all $0 \le i < n$.

Def. 1 is a shaved version of the standard semantics obtained when reading
$C \in \mathsf{PC}$ as an intuitionistic formula [?], i.e., when taking events to be atomic
propositions and replacing $\bar a$ by negation $\neg a$, concatenation of events and “$\|$”
by conjunction “$\wedge$”, and “/” by implication “$\supset$”. An empty trigger, an empty
action, and $0$ are identified with $\mathit{true}$. Then, $M \models C$ iff $C$ holds for the
intuitionistic Kripke structure $M$. In the sequel we write $SM(C)$ for $\{M \mid M \models C\}$.

In our example $C_{79} = \bar b/a + b/a$ is step-congruent to $C'_{79} = \bar b/a \,\|\, b/a$ (cf.
Sec. 4) which may be identified with formula $(\neg b \supset a) \wedge (b \supset a)$. In classic logic,
$C'_{79}$ is equivalent to the single transition $C_{12} = \cdot/a$ corresponding to formula
$\mathit{true} \supset a$. As mentioned before, this is inadequate as both have different
operational behavior, since $C'_{79} \,\|\, a/b$ fails in the empty environment whereas $C_{12} \,\|\, a/b$
has step response $\{a, b\}$. In our intuitionistic semantics, the difference is
faithfully witnessed by the stabilization sequence $M = (2, V)$, where $V(0) = \emptyset$ and
$V(1) = \{a, b\}$. Here, $M$ is a sequence model of $C'_{79}$ but not of $C_{12}$.
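
The clauses of Def. 1 are simple enough to check mechanically. The sketch below is illustrative only: `Transition`, `tr` and `is_sequence_model` are hypothetical names, a parallel configuration is encoded as a plain list of transitions, events are single characters, and the clause for transitions follows the reading "negated trigger events are absent from the final world, positive trigger events are present in world $i$".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    pos: frozenset  # positive trigger events
    neg: frozenset  # events whose negation occurs in the trigger
    act: frozenset  # action events


def tr(pos="", neg="", act=""):
    """Build a transition; every character is one event (encoding assumption)."""
    return Transition(frozenset(pos), frozenset(neg), frozenset(act))


def is_sequence_model(worlds, config):
    """Check M |= C for a chain of worlds and a list of parallel transitions."""
    worlds = [frozenset(w) for w in worlds]
    if not worlds or any(not a < b for a, b in zip(worlds, worlds[1:])):
        raise ValueError("worlds must form a non-empty, strictly increasing chain")
    final = worlds[-1]
    for t in config:
        if t.neg & final:
            continue  # a negated trigger event is present in the final world
        if any(t.pos <= w and not t.act <= w for w in worlds):
            return False  # triggered in some world without its action holding there
    return True


C79_par = [tr(neg="b", act="a"), tr(pos="b", act="a")]  # (not b -> a) and (b -> a)
C12 = [tr(act="a")]                                     # true -> a
M = ["", "ab"]                                          # V(0) = {}, V(1) = {a, b}

print(is_sequence_model(M, C79_par))  # True
print(is_sequence_model(M, C12))      # False

# On one-world sequences (the classical case) the two configurations agree.
for world in ("", "a", "b", "ab"):
    print(repr(world), is_sequence_model([world], C79_par) == is_sequence_model([world], C12))
```

Only the two-world sequence separates the configurations, which matches the observation that they coincide classically but not intuitionistically.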

Characterization of Pnueli and Shalev’s Semantics. We now show that
the step responses of a parallel configuration $C$, according to Pnueli and Shalev’s
semantics, can be characterized as particular sequence models of $C$, to which we
refer as response models. The response models of $C$ are the sequence models of $C$
of length 1, i.e., subsets of $\Pi \setminus \{\bot\}$, that do not occur as the final world of any
other sequence model of $C$ except itself.

Definition 2. Let $C \in \mathsf{PC}$. Then, $M = (1, V) \in SM(C)$ is a response model
of $C$ if $K^* = M^*$ implies $K = M$, for all $K \in SM(C)$.

Intuitively, the validity of this characterization is founded in Pnueli and Shalev’s 
closed-world assumption which requires a response to emerge from within the 
considered configuration and not by interactions with the environment. 

Theorem 1. Let $C \in \mathsf{PC}$ and $E, A \subseteq_{\mathrm{fin}} \Pi$. Then, $C \Downarrow_E A$ iff $A$ is a response
model of $C \,\|\, \cdot/E$.

Thm. 1 provides a simple model-theoretic characterization of operational step
responses; e.g., configuration $\bar a/a$ forces Pnueli and Shalev’s step-construction
procedure to fail. Indeed, the only sequence model of $\bar a/a$ of length 1 (and using
only event $a$) is $A = \{a\}$. But $A$ is not a response model since it is the final world
of $K = (2, V) \in SM(\bar a/a)$ with $V(0) =_{\mathrm{df}} \emptyset$ and $V(1) =_{\mathrm{df}} A$. Since $\bar a/a$ does not
have any response model, it can only fail. As another example, consider $a/b \,\|\, b/a$
which possesses the sequence models $(2, V)$, where $V(0) =_{\mathrm{df}} \emptyset$ and $V(1) =_{\mathrm{df}}
\{a, b\}$, and $(1, V')$, where $V'(0) =_{\mathrm{df}} \emptyset$. Only the latter is a response model, in
accordance with causality. Thus, $(a/b \,\|\, b/a) \Downarrow \emptyset$ is the only response.
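
For small event sets, response models in the sense of Def. 2 can be found by brute force. The sketch below is an illustration, not an efficient procedure: all names (`Transition`, `tr`, `is_model`, `subsets`, `chains_ending_in`, `response_models`) are hypothetical, a parallel configuration is a list of transitions, events are single characters, and only worlds over the given finite event set are enumerated.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Transition:
    pos: frozenset
    neg: frozenset
    act: frozenset


def tr(pos="", neg="", act=""):
    """Build a transition; every character is one event (encoding assumption)."""
    return Transition(frozenset(pos), frozenset(neg), frozenset(act))


def is_model(chain, config):
    """Def. 1 for a strictly increasing chain of worlds (tuple of frozensets)."""
    final = chain[-1]
    return all(
        bool(t.neg & final) or all(not t.pos <= w or t.act <= w for w in chain)
        for t in config
    )


def subsets(events):
    items = sorted(events)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def chains_ending_in(final):
    """All strictly increasing chains of worlds whose last world is `final`."""
    yield (final,)
    for smaller in subsets(final):
        if smaller < final:
            for chain in chains_ending_in(smaller):
                yield chain + (final,)


def response_models(config, events):
    """Worlds A whose only sequence model ending in A is the one-world sequence."""
    result = set()
    for a in subsets(events):
        models = [ch for ch in chains_ending_in(a) if is_model(ch, config)]
        if models == [(a,)]:
            result.add(a)
    return result


print(response_models([tr(neg="a", act="a")], "a"))                      # no response model
print(response_models([tr(pos="a", act="b"), tr(pos="b", act="a")], "ab"))  # only the empty world
# Environment {a} modeled as the extra parallel transition ./a, as in Thm. 1:
print(response_models([tr(pos="a", act="b"), tr(pos="b", act="a"), tr(act="a")], "ab"))
```

The first two calls reproduce the two examples in the text; the third adds the environment $\{a\}$ as a parallel transition and yields the single world $\{a, b\}$.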

Full Abstraction. Sequence models also lead to a fully-abstract semantics for 
parallel configurations within parallel contexts. 

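Theorem 2. Let $C, D \in \mathsf{PC}$. Then, $SM(C) = SM(D)$ iff $\forall R \in \mathsf{PC}\ \forall E, A \subseteq_{\mathrm{fin}} \Pi$.
$C \| R \Downarrow_E A$ iff $D \| R \Downarrow_E A$.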

Hence, sequence models contain precisely the information needed to capture all 
possible interactions of a parallel configuration with all potential environments. 

Characterization of Sequence Models. Of course, Thm. 2 does not mean
that every set of stabilization sequences can be obtained from a (parallel)
configuration. In fact, in intuitionistic logic it is known that in order to specify
arbitrary linear sequences, nested implications are needed [?]. Configurations,
however, only use first-order implications and negations. Their sequence models
may be characterized by simple lattice structures which we refer to as behaviors.

Definition 3. An $A$-behavior $\mathcal{C}$, for $A \subseteq_{\mathrm{fin}} \Pi$, is a pair $\langle F, I\rangle$, where $F \subseteq 2^A$
and $I$ is a monotonic function that maps every $B \in F$ to a set $I(B) \subseteq 2^B$ such
that $B \in I(B)$ and $I(B)$ is closed under intersection, i.e., $B_1, B_2 \in I(B)$ implies
$B_1 \cap B_2 \in I(B)$, for all $B \in F$. Furthermore, $\mathcal{C}$ is called bounded, if $A \in F$.

It is not difficult to show that the pairs of initial and final states occurring
together in the sequence models of $C \in \mathsf{PC}$ induce a behavior. More precisely,
if $A$ is the set of events mentioned in $C$, then the induced $A$-behavior $Beh(C)$
of $C$ is the pair $\langle F(C), I(C)\rangle$, where

$F(C) =_{\mathrm{df}} \{E \subseteq A \mid \exists (n, V) \in SM(C).\ V(n-1) = E\}$

$I(C)(B) =_{\mathrm{df}} \{E \subseteq B \mid \exists (n, V) \in SM(C).\ V(0) = E \text{ and } V(n-1) = B\}$.

Note that the response models $B$ of $C$ are precisely those $B \in F(C)$ for which
$I(C)(B) = \{B\}$. As desired, we obtain the following theorem.

Theorem 3. $\forall C, D \in \mathsf{PC}$. $Beh(C) = Beh(D)$ iff $SM(C) = SM(D)$.

In conjunction with Thm. 2 it is clear that equivalence in arbitrary parallel
contexts can as well be decided by behaviors: $Beh(C) = Beh(D)$ iff $\forall R \in \mathsf{PC}\
\forall E, A \subseteq_{\mathrm{fin}} \Pi$. $C \| R \Downarrow_E A$ iff $D \| R \Downarrow_E A$. In contrast to $SM(C)$, however, $Beh(C)$
provides an irredundant representation of parallel configurations:

Theorem 4. $\mathcal{C}$ is a (bounded) $A$-behavior iff there exists a configuration $C \in \mathsf{PC}$
over events $A$ (not using $\bot$) such that $\mathcal{C} = Beh(C)$.



Summarizing, behaviors $Beh(C)$, where $C \in \mathsf{PC}$, yield
a model representation of $SM(C)$. For each $B$ in $F(C)$,
the set $I(C)(B)$ is a $(\cap, \subseteq)$ semi-lattice with maximal
element $B$. As a very simple example, consider $C =_{\mathrm{df}}
bc/a \,\|\, ac/b \,\|\, \bar a/a \,\|\, \bar b/b \,\|\, \bar c/c$ over events $A = \{a, b, c\}$. Its
corresponding bounded $A$-behavior $Beh(C)$ is depicted in
Fig. 2. Since $F(C) = \{A\}$, we only have the $(\cap, \subseteq)$ semi-lattice
$I(C)(A)$. Generally speaking, $SM(C)$ is the set of sequences
whose world-wise intersection with $A$ are paths in
the lattice diagrams ending in maximal elements. Moreover,
the maximal elements are the classic solutions of $C$ which
may become actual responses in suitable parallel contexts.
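
The induced behavior of the example can be computed by enumerating sequence models over the three events. As before this is only a sketch: the names `Transition`, `tr`, `is_model`, `subsets`, `chains_ending_in` and `behavior` are hypothetical, the configuration is a list of parallel transitions, and events are single characters.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Transition:
    pos: frozenset
    neg: frozenset
    act: frozenset


def tr(pos="", neg="", act=""):
    """Build a transition; every character is one event (encoding assumption)."""
    return Transition(frozenset(pos), frozenset(neg), frozenset(act))


def is_model(chain, config):
    """Def. 1 for a strictly increasing chain of worlds (tuple of frozensets)."""
    final = chain[-1]
    return all(
        bool(t.neg & final) or all(not t.pos <= w or t.act <= w for w in chain)
        for t in config
    )


def subsets(events):
    items = sorted(events)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def chains_ending_in(final):
    """All strictly increasing chains of worlds whose last world is `final`."""
    yield (final,)
    for smaller in subsets(final):
        if smaller < final:
            for chain in chains_ending_in(smaller):
                yield chain + (final,)


def behavior(config, events):
    """Return (F, I): final worlds of sequence models and, per final world, the initial worlds."""
    F, I = set(), {}
    for b in subsets(events):
        starts = {ch[0] for ch in chains_ending_in(b) if is_model(ch, config)}
        if starts:
            F.add(b)
            I[b] = starts
    return F, I


# C = bc/a || ac/b || (not a)/a || (not b)/b || (not c)/c
C = [tr(pos="bc", act="a"), tr(pos="ac", act="b"),
     tr(neg="a", act="a"), tr(neg="b", act="b"), tr(neg="c", act="c")]
A = frozenset("abc")
F, I = behavior(C, A)
missing = set(subsets(A)) - I[A]

print(F == {A})
print(sorted("".join(sorted(w)) for w in I[A]))
print(sorted("".join(sorted(w)) for w in missing))
```

Under this encoding the only final world is $\{a, b, c\}$, and the initial worlds are all subsets except $\{a, c\}$ and $\{b, c\}$, the two worlds ruled out by $ac/b$ and $bc/a$.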




Fig. 2. Bounded {a, b, c}-behavior



4 Generalizing the Full-Abstraction Result 

In this section we reduce the problem of full abstraction for arbitrary configura- 
tions in arbitrary contexts to that for parallel configurations in parallel contexts. 

Reduction to Parallel Contexts. For extending the full-abstraction result
to arbitrary contexts, one must address a compositionality problem for $+$ which
already manifests itself in Pnueli and Shalev’s semantics. Consider configurations
$C =_{\mathrm{df}} \bar a/b$ and $D =_{\mathrm{df}} \bar a/b \,\|\, a/a$ which have the same responses in all parallel
contexts. However, in the choice context $\Phi[x] = (\cdot/e + x) \,\|\, \cdot/a$ we have $\Phi[D] \Downarrow \{a\}$
but $\Phi[C] \not\Downarrow \{a\}$ (as $\Phi[C] \Downarrow \{a, e\}$ only). This context is able to detect that $D$
is enabled by the environment $\cdot/a$ while $C$ is not. Hence, one has to take into
account whether there exists a transition in $C$ that is triggered for a set $A$ of
events. To store the desired information we use the triggering indicator $p(C, A) \in
\mathbb{B} =_{\mathrm{df}} \{\mathit{ff}, \mathit{tt}\}$ defined by $p(C, A) =_{\mathrm{df}} \mathit{tt}$, if $\mathit{triggered}(C, A) \neq \emptyset$, and $p(C, A) =_{\mathrm{df}} \mathit{ff}$,
otherwise.
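
The choice-context example and the triggering indicator can be checked with the same kind of exhaustive step construction. The sketch is illustrative only: `Transition`, `tr`, `trans`, `orthogonal`, `enabled`, `responses`, `p` and `ctx` are hypothetical names, configurations are nested tuples tagged `'par'` or `'sum'`, events are single characters, and consistency is read as "two distinct transitions may fire together iff they meet at a parallel node".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    pos: frozenset
    neg: frozenset
    act: frozenset


def tr(name, pos="", neg="", act=""):
    """Build a transition; every character is one event (encoding assumption)."""
    return Transition(name, frozenset(pos), frozenset(neg), frozenset(act))


def trans(c):
    if isinstance(c, Transition):
        return {c}
    return set().union(*(trans(k) for k in c[1:]))


def orthogonal(c, t, u):
    """True if t and u are equal or their least common ancestor is a 'par' node."""
    if t == u:
        return True
    kind, *kids = c
    kt = next(k for k in kids if t in trans(k))
    ku = next(k for k in kids if u in trans(k))
    return orthogonal(kt, t, u) if kt is ku else kind == "par"


def triggered(t, events):
    return t.pos <= events and not (t.neg & events)


def enabled(c, env, fired):
    events = frozenset(env).union(*(t.act for t in fired))
    return frozenset(t for t in trans(c)
                     if all(orthogonal(c, t, u) for u in fired) and triggered(t, events))


def responses(c, env=""):
    """All step responses of c under env, exploring every nondeterministic choice."""
    env, found, seen = frozenset(env), set(), set()

    def explore(fired):
        if fired in seen:
            return
        seen.add(fired)
        en = enabled(c, env, fired)
        if fired < en:
            for t in en - fired:
                explore(fired | {t})
        elif fired == en:
            found.add(env.union(*(t.act for t in fired)))

    explore(frozenset())
    return found


def p(c, events):
    """Triggering indicator: is some transition of c triggered by the event set?"""
    return any(triggered(t, frozenset(events)) for t in trans(c))


def ctx(x):
    """The choice context (./e + x) || ./a from the text."""
    return ("par", ("sum", tr("te", act="e"), x), tr("ta", act="a"))


C = tr("c1", neg="a", act="b")
D = ("par", tr("d1", neg="a", act="b"), tr("d2", pos="a", act="a"))

print(responses(ctx(D)))   # contains the response {a}
print(responses(ctx(C)))   # only the response {a, e}
print(p(C, "a"), p(D, "a"))
```

In the context, $D$ can take part in the step through $a/a$, which blocks $\cdot/e$, whereas $C$ cannot; the indicator $p$ records exactly this difference for the event set $\{a\}$.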

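Lemma 1. Let $C, D \in \mathsf{C}$. Then $C \simeq D$ iff $\forall P \in \mathsf{PC}, A \subseteq_{\mathrm{fin}} \Pi, b \in \mathbb{B}$. ($C \| P \Downarrow A$
and $p(C, A) = b$) iff ($D \| P \Downarrow A$ and $p(D, A) = b$).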

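Thus, to ensure compositionality for arbitrary contexts we only need to record
$[\![C]\!]_1^b =_{\mathrm{df}} \{\langle A, P\rangle \mid C \| P \Downarrow A,\ p(C, A) = b,\ P \in \mathsf{PC}\}$, for $b \in \mathbb{B}$, instead of $[\![C]\!]_0$.
We may view $[\![C]\!]_1^{\mathit{tt}}$ as the collection of active and $[\![C]\!]_1^{\mathit{ff}}$ as the collection of passive
responses for $C$ in parallel contexts, according to whether a transition of $C$ takes
part in response $A$. By Lemma 1, $C \simeq D$ iff $[\![C]\!]_1^{\mathit{tt}} = [\![D]\!]_1^{\mathit{tt}}$ and $[\![C]\!]_1^{\mathit{ff}} = [\![D]\!]_1^{\mathit{ff}}$.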

Reduction to Parallel Configurations. For eliminating the choice operator
from configurations we employ a distributivity law. However, the naive
distributivity law $C \simeq D$ for $C =_{\mathrm{df}} (t_1 + t_2) \| t_3$ and $D =_{\mathrm{df}} (t_1 \| t_3') + (t_2 \| t_3'')$, where
transitions $t_3'$ and $t_3''$ are identical to $t_3$ except for their name, does in general
not hold. Consider $t_i =_{\mathrm{df}} a_i \bar b_i / c_i$, for $1 \le i \le 3$, and assume that all events are
mutually distinct. Then, in a context in which $t_2$ is enabled but not $t_1$, transition
$t_3$ in $C$ is forced to interact with $t_2$, while in $D$ transition $t_3$ may run by
itself in the summand $t_1 \| t_3'$. E.g., if $E = \{a_2, a_3\}$ then $D \Downarrow_E \{c_3, a_2, a_3\}$, but
the only $A$ with $c_3 \in A$ and $C \Downarrow_E A$ is $A = \{c_2, c_3, a_2, a_3\}$.

The naive distributivity law can be patched by adding configurations $D_1(t_3)$
and $D_2(t_3)$ such that $C \simeq t_1 \| D_1(t_3) + t_2 \| D_2(t_3)$. Here, $D_1(t_3)$ must weaken $t_3$
such that it disables $t_3$ whenever $t_1$ is not enabled but $t_2$ is. A simple way
to achieve this is to define $D_1(t_3) =_{\mathrm{df}} D_1 \| t_3'$ and $D_2(t_3) =_{\mathrm{df}} D_2 \| t_3''$, where
$D_i =_{\mathrm{df}} \bar a_i a_{3-i} \bar b_{3-i} / \bot \,\|\, b_i a_{3-i} \bar b_{3-i} / \bot$, for $i \in \{1, 2\}$. As desired, the “watchdog”
configuration $D_i$ satisfies for all parallel contexts $P$: $D_i \| P \Downarrow A$ iff (i) $P \Downarrow A$ and
(ii) $A$ triggers $t_i$ or does not trigger $t_{3-i}$. It should be clear how this can be
generalized, i.e., how one constructs for any $C, D \in \mathsf{C}$ a configuration $\mathit{watch}(C, D)$
such that $P \| \mathit{watch}(C, D) \Downarrow A$ iff (i) $P \Downarrow A$ and (ii) $\mathit{triggered}(C, A) \neq \emptyset$ or
$\mathit{triggered}(D, A) = \emptyset$.

Lemma 2. Let $C_1, C_2, D \in \mathsf{C}$. Then, $(C_1 + C_2) \| D \simeq (\mathit{watch}(C_1, C_2) \| C_1 \| D) +
(\mathit{watch}(C_2, C_1) \| C_2 \| D)$.

The fact that we have available an explicit failure event $\bot$ makes this
distributivity law particularly simple. The use of $\bot$, however, is inessential as it can be
eliminated [?]. Now, by repeatedly applying distributivity we may push
occurrences of operator $+$ to the outside of configurations.

Lemma 3. Let $C \in \mathsf{C}$. Then, there exists a finite index set $\mathit{ind}(C)$ and $C_i \in \mathsf{PC}$,
for $i \in \mathit{ind}(C)$, such that $C \simeq \sum_{i \in \mathit{ind}(C)} C_i$.

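Hence, $[\![C]\!]_1^b = [\![\sum_{i \in \mathit{ind}(C)} C_i]\!]_1^b$. Moreover, since an active response of a sum must
be an active response of one of its summands and since a passive response of
a sum always is a passive response of all of its summands,
$[\![\sum_{i \in \mathit{ind}(C)} C_i]\!]_1^{\mathit{tt}} = \bigcup_{i \in \mathit{ind}(C)} [\![C_i]\!]_1^{\mathit{tt}}$ and
$[\![\sum_{i \in \mathit{ind}(C)} C_i]\!]_1^{\mathit{ff}} = \bigcap_{i \in \mathit{ind}(C)} [\![C_i]\!]_1^{\mathit{ff}}$ hold. Thus, we obtain:

Lemma 4. Let $C, D \in \mathsf{C}$. Then, $C \simeq D$ iff $\bigcup_{i \in \mathit{ind}(C)} [\![C_i]\!]_1^{\mathit{tt}} = \bigcup_{j \in \mathit{ind}(D)} [\![D_j]\!]_1^{\mathit{tt}}$
and $\bigcap_{i \in \mathit{ind}(C)} [\![C_i]\!]_1^{\mathit{ff}} = \bigcap_{j \in \mathit{ind}(D)} [\![D_j]\!]_1^{\mathit{ff}}$.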

Full-abstraction Result. Now, we are able to use our analysis of Sec. 3 to
phrase Lemma 4 in terms of behaviors. All we need to do is to replace the
parallel configuration $P \in \mathsf{PC}$ in every pair $\langle A, P\rangle \in [\![C_i]\!]_1^b$, for $i \in \mathit{ind}(C)$, by
its behavior $Beh(P)$. It turns out that the pairs obtained in this way can be
uniquely determined from the behavior $Beh(C_i)$ of $C_i$, for any $i \in \mathit{ind}(C)$.

Definition 4. Let $A \subseteq_{\mathrm{fin}} \Pi$. An $A$-behavior $\langle F, I\rangle$ is called an $A$-context for
$C \in \mathsf{PC}$ if (i) $F = \{A\}$, (ii) $A \in F(C)$, and (iii) $I(A) \cap I(C)(A) = \{A\}$.

Note that $A$-contexts for $C$ are bounded behaviors, i.e., they can be represented
without $\bot$. An $A$-context $\mathcal{P}$ of $C$ represents a set of sequences that all end in
the final world $A$, in which also some sequence model of $C$ must end and which
only have world $A$ in common with the sequence models of $C$ ending in $A$. These
properties imply $C \| P \Downarrow A$, for every $P$ with $Beh(P) = \mathcal{P}$. Hence, $A$-contexts $\mathcal{P}$
are “relativized complements” of $C$ wrt. the final response $A$.

Consider again example $C$ from above, whose
sequence models $SM(C)$ are described by the behavior
of Fig. 2. To get the $A$-contexts of $C$, where
$A = \{a, b, c\}$, we must take the “complement”
of $I(C)(A)$, i.e., all $B \subseteq A$ that are missing in the
lattice of Fig. 2. As shown in Fig. 3, $C$ has two $A$-contexts
$\mathcal{P}_1$ and $\mathcal{P}_2$ covering this complement;
configurations that denote them are $P_1 =_{\mathrm{df}} \cdot/ac \,\|\, \bar b/b$
and $P_2 =_{\mathrm{df}} \cdot/bc \,\|\, \bar a/a$, respectively. These provide
complete information since every $A$-context must be
contained in $\mathcal{P}_1$ or $\mathcal{P}_2$. For all $C \in \mathsf{PC}$ and $b \in \mathbb{B}$ we
are finally led to define $[\![C]\!]_2^b =_{\mathrm{df}} \{\langle A, \mathcal{P}\rangle \mid A \subseteq_{\mathrm{fin}} \Pi,\
p(C, A) = b,\ \mathcal{P} \text{ is } A\text{-context of } C\}$ and obtain as a
corollary to Lemma 4 and Thm. 3:



Fig. 3. {a, b, c}-contexts



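Theorem 5. Let $C, D \in \mathsf{C}$. Then, $C \simeq D$ iff $\bigcup_{i \in \mathit{ind}(C)} [\![C_i]\!]_2^{\mathit{tt}} = \bigcup_{j \in \mathit{ind}(D)} [\![D_j]\!]_2^{\mathit{tt}}$
and $\bigcap_{i \in \mathit{ind}(C)} [\![C_i]\!]_2^{\mathit{ff}} = \bigcap_{j \in \mathit{ind}(D)} [\![D_j]\!]_2^{\mathit{ff}}$.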



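With Thm. 5 we have finally achieved our goal, as $[\![C]\!]_2^b$ is satisfactorily
semantical and finite. In combination with Lemma 2 it directly lends itself to
be applied for a model-based implementation of Pnueli and Shalev’s semantics,
which does not require backtracking for handling failure. Finally, it should
be stressed that the above theorem also holds if we restrict ourselves to
“+”-configurations of the form $C + t$, as in Statecharts, instead of permitting
configurations $C + D$, for arbitrary $C, D \in \mathsf{C}$ (cf. Sec. 2). We now return to the example
of Figs. 2 and 3. Let $\mathit{id}_B$ be the behavior $\langle \{B\}, B \mapsto \{B\}\rangle$, for $B \subseteq \Pi$. We have
$[\![C]\!]_2^{\mathit{tt}} = \{\langle \{a, b, c\}, \mathcal{P}_i\rangle \mid i = 1, 2\}$ and $[\![C]\!]_2^{\mathit{ff}} = \emptyset$. The same semantics can be
generated as $D_1 + D_2$, where $D_1 = bc/a \,\|\, \bar b/b \,\|\, \bar a/a$ and $D_2 = ac/b \,\|\, \bar b/b \,\|\, \bar c/c$, since
$[\![D_i]\!]_2^{\mathit{tt}} = \{\langle \{a, b, c\}, \mathcal{P}_{3-i}\rangle\}$, $[\![D_1]\!]_2^{\mathit{ff}} = \{\langle \{a, b\}, \mathit{id}_{\{a,b\}}\rangle\}$, $[\![D_2]\!]_2^{\mathit{ff}} = \{\langle \{b, c\}, \mathit{id}_{\{b,c\}}\rangle\}$.
Hence, $[\![D_1]\!]_2^{\mathit{tt}} \cup [\![D_2]\!]_2^{\mathit{tt}} = [\![C]\!]_2^{\mathit{tt}}$ and $[\![D_1]\!]_2^{\mathit{ff}} \cap [\![D_2]\!]_2^{\mathit{ff}} = \emptyset = [\![C]\!]_2^{\mathit{ff}}$. By Thm. 5,
$C \simeq D_1 + D_2$. A similar reasoning reveals $C_{79} \simeq C'_{79}$ (cf. Sec. 3).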



5 Discussion and Related Work 

Our investigation focused on Pnueli and Shalev’s presentation of Statecharts and
its macro-step semantics. The elegance of their operational semantics manifests
itself in the existence of an equivalent declarative fixed point semantics [?].
However, as illustrated in [?], this equivalence is violated when allowing
disjunctions in transition triggers. For example, the configurations $(\bar a \vee b)/a$ and
$\bar a/a \,\|\, b/a$ do not have the same response behavior. This subtlety can now be
explained in our framework. In Pnueli and Shalev’s setting, $\bar a \vee b$ is classically
interpreted as “throughout a macro step, not $a$ or $b$.” In contrast, this paper
reads the configuration as “throughout a macro step not $a$ or throughout $b$.”
Our framework can also be employed for analyzing various other variants of
Statecharts semantics, such as the one of Maggiolo-Schettini et al. [14] which in
turn is inspired by the process-algebraic semantics presented in [17]. In [14] the
step-construction procedure cannot fail since a transition is only considered to
be enabled, if it is enabled in the sense of Pnueli and Shalev and if it does not
produce any event that violates global consistency. As an example, consider the
configuration $C =_{\mathrm{df}} t_1 \| t_2$, where $t_1 =_{\mathrm{df}} a/b$ and $t_2 =_{\mathrm{df}} \bar b/a$. According to [14],
when $C$ is evaluated for the empty environment, response $\{a\}$ is obtained; in
Pnueli and Shalev’s semantics, however, the step construction fails. The
difference can be explained in terms of stabilization sequences. While Pnueli and
Shalev take $t_1$ to stand for the specification $a \supset b$ and $t_2$ for $\neg b \supset a$,
Maggiolo-Schettini et al. apply the interpretation $a \supset (b \vee \neg b)$ for $t_1$ and $\neg b \supset (a \vee \neg a)$
for $t_2$. Indeed, as one verifies, $\{a\}$ then is a response model of $t_1 \| t_2$. Note again
that $a \vee \neg a$ is different from $\mathit{true}$ in intuitionistic logic. Generalizing this
example, the transition semantics of [14] can be captured in terms of response models
by reading a transition $E/A$ as formula $E \supset (A \vee \neg A)$, if our setting were
extended to allow disjunctions as part of actions.

Our intuitionistic approach is also related to recent work in synchronous
languages, especially for Berry’s Esterel [?]. In Esterel, causality was traditionally
treated separately from compositionality and synchrony as part of type-checking
specifications. If the (conservative) type checker found causality to be violated,
it rejected the specification under consideration. Otherwise, the specification’s
semantics could be determined in a very simple fashion; one may, in contrast
to Statecharts semantics, abstract from the construction details of macro
steps while preserving compositionality, as shown by Broy in [?]. Version 5 of
Esterel [2] replaced the treatment of causality by defining a semantics via a
particular Boolean logic that is constructive, as is intuitionistic logic.

Denotational semantics and full abstraction were also studied by Huizing et
al. [10, 11] for an early and later-on rejected Statecharts semantics [?]. That
semantics does not consider global consistency, which makes their result largely
incomparable to ours. Finally, it should be mentioned that the lack of
compositionality of Statecharts semantics inspired the development of new languages,
such as Alur et al.’s communicating hierarchical state machines.



6 Conclusions 

To the best of our knowledge, this is the first paper to present a fully-abstract 
Statecharts semantics for Pnueli and Shalev’s original macro-step semantics [?].
The latter semantics is non-compositional as it employs classic logic for inter- 
preting macro steps. In contrast, our semantics borrows ideas from intuitionistic 
logic. It encodes macro steps via stabilization sequences which we characterized 
using semi-lattice structures, called behaviors. Behaviors capture the interactions 
between Statecharts and their environments and consistently combine the no- 
tions of causality, global consistency, and synchrony. Moreover, our approach sug- 
gests a model-based implementation of Pnueli and Shalev’s semantics, thereby 
eliminating the need to implement failure via backtracking. 

Abstract. We extend the algebraic approach of Meseguer and Montanari
from ordinary place/transition Petri nets to contextual nets, covering
both the collective and the individual token philosophy uniformly along 
the two interpretations of net behaviors. 



Introduction 

Among the models for concurrency, place/transition Petri nets (pt nets),
introduced by Petri in [?] (see also [?]), are among the most widely used, with
many interdisciplinary applications. The reasons for the success of the net model
probably reside in the simple formal description and natural characterization of
concurrent and distributed systems: the state of a system consists of a (multi)set 
of distributed resources, its actions consume some of the resources available and 
release fresh resources, thus affecting only local subsystems. In particular, a com- 
putation can be described as a partial order of events such that two events in the 
same computation are either causally dependent - when one could not have been 
executed without a resource provided by the other - or concurrent - when they 
could have happened in any order, because they affect independent subsystems. 

Several extensions of the basic net paradigm have been considered in the
literature that either increase the expressive power or give a better representation
of existing phenomena. This paper focuses on contextual nets, also known as nets
with read arcs, or condition arcs, or test arcs [?]. The underlying idea
is that of reading resources without consuming them, thus providing a way of
modeling multiple concurrent accesses to the same resource. With ordinary pt
nets such readings must be rendered as self-loops, and this imposes an unfortunate
sequentialization of concurrent readings. On the contrary, with contextual
nets, besides pre- and post-sets, transitions also have ‘contexts’, that is, resources
that are necessary for the enabling, but not affected by the firing. Contextual
nets have found applications, e.g., to transaction serializability in databases [?],
concurrent constraint programming [12], and asynchronous systems [20].

The extensive use of pt nets has given rise to different schools of thought 
concerning their semantic interpretation. In particular, the main distinction is 
drawn between collective and individual token philosophies (see e.g. [?]).

mi Eterogenei', by Esprit Working Groups CONFER2 and COORDINA-, and by
MURST project TOSCA.

Fig. 1.



According to the collective token philosophy (CTph), one should not distin- 
guish among different tokens in the same place (i.e., among instances of the 
same resource), because all such tokens are operationally equivalent. This view 
disregards that tokens may have different origins and histories and may,
therefore, carry different causality information. Selecting one instance rather than
another can make the difference between being causally dependent or not on some
previous event. And this may well be a piece of information one does not want to
discard, which is the point of the individual token philosophy (ITph). Of course, 
causal dependencies may influence the degree of concurrency in computations, 
and therefore CTph and ITph lead to different concurrent semantics. 

Independently of CTph and ITph, for contextual nets several different
approaches have been proposed that differ in the way in which contexts are read.
For example, let us consider the nets $N_1$, $N_2$ and $N_3$ in Figure 1, taken from [?].
(As usual, places are represented by circles, tokens by black bullets, transitions
by boxes, pre- and post-sets by directed weighted arcs, and contexts by
undirected weighted arcs, with unary weights always omitted.) According to [?], the
transitions $t_0$ and $t_1$ can fire concurrently in $N_1$, but neither in $N_2$ nor in $N_3$,
since the basic assumption is that a token cannot be read and consumed in the
same step. In [3], instead, the concurrent step is allowed for all three nets, the
basic assumption being that $t_0$ and $t_1$ can both start together and read the context
tokens, without needing them while the actions take place. Besides its possible
merits, we find this interpretation not fully convincing as, for instance, in $N_3$ we
would end up in a state that cannot be reached by any firing sequence. The basic
assumption of [?] that firings have duration leads one to consider ST-traces, where
explicit transition-start and transition-end events are fired. Hence $N_2$ can start
$t_0$ and then $t_1$ before $t_0$ completes, allowing the concurrent step. On the
contrary, in $N_3$ if either $t_0$ or $t_1$ starts, then the context for the other transition
is consumed and the concurrent step is forbidden. In this paper, we follow the
interpretation of [13] that fits better our understanding of contexts.



Collecting Tokens. The seminal paper [?] proposed an algebraic approach to
the analysis of net behaviors relying on the basic observation that the monoidal
structure of pt net states (i.e., the markings) can be lifted to the level of
computations so as to obtain an algebraic initial model for concurrent net behaviors
according to the CTph.
The algebraic net theory developed under the CTph is well consolidated, and
the relationships between its computational, algebraic and logical interpretations
are by now very clear. Starting with the classical ‘token-game’ semantics,
many behavioral models for Petri nets have been proposed that follow the CTph.
In particular, the commutative processes of Best and Devillers [2] reconcile the
‘diamond’ equivalence on firing and step sequences, and express very nicely the
concurrency of the model. They also admit an exact algebraic representation
by means of the universal construction T(_) that yields strictly symmetric strict
monoidal categories from the category of pt nets. More precisely, given a pt net
$N$, the objects of T($N$) are the elements of the free commutative monoid over
the set of places, and its arrows correspond to the commutative processes of $N$ [?].

Surprisingly, the CTph semantics of contextual nets has received little attention
in the literature. Whether because the problem has been underestimated,
or simply because the ITph is more fascinating, we cannot tell. In any case, we
think that it is useful to remove this discrepancy with the semantics of ordinary
PT nets. Moreover, although one can easily extend the diamond equivalence to
firing sequences on contextual nets, the formalization of a good algebraic model
is not at all straightforward. Inspired by a suggestion made by Meseguer in [9],
we give here a fully satisfactory treatment of this issue. The idea is to consider
monoidal categories with a commutative tensor product taken, differently from
the case of PT nets, over a non-free monoid of places. In particular, we regard
each token a as an atom that can emit several ‘negative’ particles $a^-$, while keeping
track of the number of electrons around, i.e., as in [9], we assume that for
all $k \in \mathbb{N}$, $a = a^k \oplus k \cdot a^-$, with $a^k$ a shorthand for $(\_)^+$ applied k times to a.

Replacing context arcs on a with self-loop arcs on $a^-$, we are able to give an
axiomatic construction of a monoidal category whose arrows between standard
markings (i.e., containing no negative particles) are (isomorphic to) the concurrent
computations of the net according to the CTph. A key ingredient for this
result to hold is the so-called maximum sharing hypothesis, an axiom expressing
that concurrent readings can always be seen as sharing the same token, a
fundamental idea in the CTph.

Observing Causal Dependencies. Building on the notion of process introduced
by Goltz and Reisig in [7], several authors have shown that the semantics
of nets in the ITph can still be understood in terms of symmetric monoidal
categories. In particular, a simple variation of Goltz-Reisig processes called
concatenable processes is introduced in [5] (see also [17]), which admits sequential
composition and yields a symmetric monoidal category $\mathcal{P}(N)$ for each net N.
Several unfolding semantics (see e.g. [22, ?]) have also been proposed that give a
denotational interpretation of the interplay between concurrency, causality and
nondeterminism.

For contextual nets both the process and the unfolding approaches have been
studied [?], giving a satisfactory understanding of the computational model
via the introduction of asymmetric event structures. The algebraic approach,
however, has been pursued only in a recent paper by Gadducci and Montanari [6]
using match-share categories. There, the basic idea is that, together
with symmetries, two additional auxiliary constructors must be present: one for
duplicating tokens and one for matching them. Read arcs can then be replaced
by self-loops, and reading without consuming is modeled by duplicating the context,
firing the transition concurrently with an idle copy of the context, and
then matching the idle copy with the corresponding produced tokens. Multiple
concurrent access is achieved by producing via duplication, and then absorbing
via matching, enough copies of the context. In [6] a suitable axiomatization of
duplicators and matchers is introduced and proved to represent faithfully the
basic fact about concurrent access: steps sharing the same context, but otherwise
disjointly enabled, can execute concurrently or in any interleaved order with
no noticeable difference. The main problem of this approach is that the initial
model contains too many arrows and, therefore, in order to obtain a bijection
with contextual processes one has to carve out a suitable subcategory. Although the
arrows of this subcategory can be characterized by inspecting their structure,
the lack of a global correspondence somehow weakens the framework.

We aim at improving the approach of [6], by noticing that unwanted arrows
are due to redundant information in the model. In fact, once a context token
is read by a transition we know the ‘real’ token it is connected to: the one
duplication was applied to. Hence, the match operation, needed for expressing
concurrent readings, does not add any further information and may introduce
inconsistent behaviors. For example, given two tokens in the place a, one can
first duplicate both and then match each copy of the first token with a copy of
the second token: the resulting arrow is meaningless from the computational
viewpoint. We overcome this problem by extending to the ITph the approach
proposed in the first part of the paper for the CTph. In particular, besides $a^+$
and $a^-$ we introduce the term $a_-$ for each place a, with $a = a^k \otimes k \cdot a_-$.

Each context arc from a to t is then replaced by putting $a^-$ in the source of t
and $a_-$ in the target of t. This is necessary to prevent contexts released by a
transition from being consumed by another transition, and represents, in the ITph, a sort
of dual to the maximum sharing hypothesis. Then, we introduce symmetries on
markings, but regulate their use on the $a^+$, $a^-$ and $a_-$ so as to forbid the swapping
of an $a^+$ and an adjacent $a^-$ or $a_-$. This is actually the key of our proposal, as it prevents
electrons from migrating from atom to atom, which is essentially what happens
in [6]. We impose this restriction by omitting the corresponding symmetries.
Putting such arrows back in the model would in fact result in a redundant
framework perfectly analogous to the one of match-share categories. Our main
result is that, again, the arrows between standard markings are in bijection with
a slight refinement of contextual processes, called strongly concatenable.

Structure of the Paper. In Section 1 we recall some basics about contextual nets
and the algebraic semantics of PT nets. In Sections 2 and 3 we define algebraic
semantics for contextual nets under both the CTph and the ITph, providing original
characterization results for commutative and strongly concatenable contextual
processes. We remark that in the absence of read arcs, our semantics coincide
with the classical ones.

1 Preliminaries



1.1 Contextual Nets 



Contextual nets were introduced for extending PT nets with the ‘read without
consume’ operation [?]. The states of contextual nets are called markings
and represent distributions of resources (tokens) in typed repositories (places).
Given the set of places S, markings can be seen as multisets $u: S \to \mathbb{N}$, where $u(a)$
denotes the number of tokens that place a carries in u. The set of finite multisets
on S is a free commutative monoid on S. We denote it by $S^\oplus$, and indicate
multiset inclusion, difference and union by $\subseteq$, $\ominus$ and $\oplus$, respectively. For k a
natural number and u a multiset, $k \cdot u$ is the multiset such that $(k \cdot u)(a) = k \cdot u(a)$
for all a. We denote by $\lfloor u \rfloor$ the underlying set of u, which can be seen as the
multiset such that $\lfloor u \rfloor(a) = 1$ if $u(a) > 0$ and $\lfloor u \rfloor(a) = 0$ otherwise.



Definition 1. A contextual net N is a tuple $(S, T, \partial_0, \partial_1, \varsigma)$, where S is the set
of places, T is the set of transitions, $\partial_0, \partial_1: T \to S^\oplus$ are the pre- and post-set
functions, and $\varsigma: T \to S^\oplus$ is the context function.



Informally, $\partial_0(t) \oplus \varsigma(t)$ is the minimum amount of resources that t requires to
be enabled. Of these resources, those in $\partial_0(t)$ are retrieved and consumed, while
those in $\varsigma(t)$ are just read and left on their repositories. When t has accomplished
its task, it returns $\partial_1(t)$ fresh tokens and releases the context. Only at this point
will other transitions be able to consume the tokens in $\varsigma(t)$, whereas they can
use the same context concurrently with t. Besides the usual assumption that $\varsigma(t)$
and $\partial_0(t) \oplus \partial_1(t)$ are disjoint for each transition t, we assume that $\varsigma(t)$ is a set.



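Definition 2. Let u and v be markings, and X a finite multiset of transitions
of a contextual net $N = (S, T, \partial_0, \partial_1, \varsigma)$. We say that u evolves to v under the
step X, in symbols $u\,[X\rangle\,v$, if the transitions in X are concurrently enabled at
u, i.e., $\bigoplus_{t \in T} X(t) \cdot \partial_0(t) \oplus \lfloor \bigoplus_{t \in X} \varsigma(t) \rfloor \subseteq u$, and

$$v = \Big( u \ominus \bigoplus_{t \in T} X(t) \cdot \partial_0(t) \Big) \oplus \bigoplus_{t \in T} X(t) \cdot \partial_1(t).$$

A step sequence from $u_0$ to $u_n$ is a sequence $u_0\,[X_1\rangle\,u_1 \dots u_{n-1}\,[X_n\rangle\,u_n$.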

Thus the execution of the step X requires that the marking u contains at least
all the tokens in the preconditions of transitions in X plus at least one token
for each place that is used as context by some transition in X. This matches
the intuition that a token can be used as context by many transitions at the
same time. From the concurrent point of view, the fact that transitions in X
are executed in a step means that they can be equivalently executed in any
order. Thus, as for ordinary PT nets, step sequences for contextual nets can be
considered up to the equivalence induced by the diamond transformation relation
$\_ \diamond \_$ defined by $u\,[X \oplus Y\rangle\,v \;\diamond\; u\,[X\rangle\,u_1\,[Y\rangle\,v$ for any step $u\,[X \oplus Y\rangle\,v$ (and
suitable $u_1$). The diamond equivalence is the reflexive, symmetric, transitive and
sequence-concatenation closure of the relation $\_ \diamond \_$.
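
As an illustration only, the enabling condition and the firing rule of Definition 2 can be sketched in Python, with markings and steps represented as `collections.Counter` multisets. The function names and the dictionaries `PRE`, `POST` and `CTX` are assumptions of this sketch, not notation from the paper. The small example net is hypothetical: t0 consumes a and reads c, t1 consumes b and reads c, and t2 consumes c.

```python
from collections import Counter

def weighted_sum(step, part):
    """Multiset sum over the transitions t of step[t] copies of part[t]."""
    total = Counter()
    for t, times in step.items():
        for place, n in part[t].items():
            total[place] += times * n
    return total

def enabled(u, step, pre, ctx):
    """Enabling: all pre-sets (with multiplicity) plus ONE token per place
    that is read as context by some transition of the step."""
    need = weighted_sum(step, pre)
    for place in {p for t, times in step.items() if times > 0 for p in ctx[t]}:
        need[place] += 1
    return all(u[place] >= n for place, n in need.items())

def fire(u, step, pre, post, ctx):
    """Marking reached from u by the step: remove pre-sets, add post-sets."""
    if not enabled(u, step, pre, ctx):
        raise ValueError("step is not enabled at this marking")
    v = Counter(u)
    v.subtract(weighted_sum(step, pre))
    v.update(weighted_sum(step, post))
    return +v  # unary plus drops places left with zero tokens

# Hypothetical example net (names chosen for this sketch only).
PRE = {"t0": Counter(a=1), "t1": Counter(b=1), "t2": Counter(c=1)}
POST = {"t0": Counter(), "t1": Counter(), "t2": Counter()}
CTX = {"t0": {"c"}, "t1": {"c"}, "t2": set()}
```

With the marking `Counter(a=1, b=1, c=1)`, the step containing t0 and t1 is enabled with a single shared token in c, and firing it reaches the same marking as firing t0 and then t1, which is the diamond property in miniature. The step containing t0 and t2 is not enabled, since the token in c cannot be both read and consumed in the same step.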







R. Bruni and V. Sassone 



Definition 3. Given a contextual net N, the strictly symmetric strict monoidal
category of contextual commutative processes $\mathcal{CCP}(N)$ has the markings of N as
objects and its step sequences, taken modulo the diamond equivalence, as arrows.

In the ITph, computations are commonly described in terms of structures
representing the causal relationships between event occurrences. In the case of nets,
this is fruitfully formalized through the following notion of process. We remark
that these notions are conservative extensions of the corresponding notions for
ordinary PT nets, to which they reduce in the absence of read arcs.

Definition 4. A contextual process net is a finite, acyclic (w.r.t. the preorder
in which t precedes t' if either $\partial_1(t) \cap (\partial_0(t') \cup \varsigma(t')) \neq \emptyset$ or $\varsigma(t) \cap \partial_0(t') \neq \emptyset$)
contextual net $\Theta$ such that (1) for all $t \in T_\Theta$, $\partial_0(t)$ and $\partial_1(t)$ are sets (as opposed
to multisets), and (2) for all pairs $t_0 \neq t_1 \in T_\Theta$, $\partial_i(t_0) \cap \partial_i(t_1) = \emptyset$, for i = 0, 1.
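
The conditions of Definition 4 can be checked mechanically. The following Python sketch is one possible reading, under the assumptions that pre- and post-sets are given as `collections.Counter` multisets, contexts as Python sets, and that acyclicity means the precedence relation has no cycle (including self-loops). All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def precedes(t, s, pre, post, ctx):
    """t precedes s if s consumes or reads a token produced by t,
    or s consumes a token that t reads."""
    return bool(post[t] & (pre[s] | ctx[s])) or bool(ctx[t] & pre[s])

def is_acyclic(transitions, pre, post, ctx):
    """Repeatedly remove the transitions with no predecessor among the rest."""
    remaining = set(transitions)
    while remaining:
        minimal = {s for s in remaining
                   if not any(precedes(t, s, pre, post, ctx) for t in remaining)}
        if not minimal:
            return False  # every remaining transition has a predecessor: a cycle
        remaining -= minimal
    return True

def is_process_net(pre, post, ctx):
    transitions = list(pre)
    # Condition (1): pre- and post-sets are sets, i.e. no multiplicity above one.
    for t in transitions:
        if any(n > 1 for n in pre[t].values()) or any(n > 1 for n in post[t].values()):
            return False
    pre_set = {t: set(+pre[t]) for t in transitions}
    post_set = {t: set(+post[t]) for t in transitions}
    ctx_set = {t: set(ctx[t]) for t in transitions}
    # Condition (2): distinct transitions share no pre-place and no post-place.
    for t0, t1 in combinations(transitions, 2):
        if pre_set[t0] & pre_set[t1] or post_set[t0] & post_set[t1]:
            return False
    return is_acyclic(transitions, pre_set, post_set, ctx_set)
```

Note that two transitions may share a context place without violating condition (2), which only constrains pre- and post-sets.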

Definition 5. A contextual process $\pi$ of a contextual net N consists of a contextual
process net $\Theta$ together with a pair of functions $(\pi_T, \pi_S)$, where $\pi_T: T_\Theta \to T_N$
and $\pi_S: S_\Theta \to S_N$, that respect source, target and context, i.e., such that
$\partial_{N,i} \circ \pi_T = \pi_S \circ \partial_{\Theta,i}$, for i = 0, 1, and $\varsigma_N \circ \pi_T = \pi_S \circ \varsigma_\Theta$. Contextual processes
are considered up to isomorphism.

1.2 Petri Nets Are Monoids 

The paper [10] exploited the monoidal structure of markings to provide an
algebraic characterization of the concurrent computations of nets. The basic idea
was to lift the structure of states to the level of transitions, providing an
algebraic representation of concurrent firing. In turn, these ‘algebraic’ steps can
be sequentially concatenated in order to express more complex computations.
Since sequential composition endows computations with a categorical structure
(markings are objects, computations are arrows, and idle tokens are identities),
the parallel composition yields a tensor product. The interplay of parallel and
sequential composition, regulated by functoriality of tensor products, models a basic
fact about concurrency, namely that concurrent transitions can occur in any
order. Under the CTph the tensor product can simply be commutative. Then,
each PT net N freely generates a strictly symmetric strict monoidal category
$\mathcal{T}(N)$ whose arrows are in bijection with the commutative processes of N [?].

Under the ITph the situation is more complex. In order to be able to model
causal dependencies one cannot consider multisets of transitions. The proposal
of Degano, Meseguer and Montanari was to introduce a non-commutative tensor
product, while keeping markings as objects, together with suitable arrows
for exchanging the order in which transitions fetch and produce tokens [5]. Such
arrows are called symmetries, and are formalized categorically as the components
of a natural isomorphism. This approach leads to the construction of a (non
strictly) symmetric strict monoidal category $\mathcal{P}(N)$ for each net N, whose arrows
define the concatenable processes of N. A more concrete construction, $\mathcal{Q}(N)$, was
introduced in [18] in order to remove some deficiencies of the previous approach.
The main feature of $\mathcal{Q}(N)$, which captures the so-called strongly concatenable
processes, is that its objects are strings rather than multisets of tokens.

2 Collective Contexts

As explained in the Introduction, we build the algebraic theory over a non-free
monoid of places. In particular, apart from the commutative monoidal operation
$\_ \oplus \_$ with unit 0, we consider two other operations $(\_)^+$ and $(\_)^-$ that are
axiomatized as in Figure 2, where we omit the usual associativity, commutativity
and unit axioms for $\_ \oplus \_$. These mean precisely that $(\_)^+$ and $(\_)^-$ are monoid
homomorphisms such that $(\_)^+ \oplus (\_)^- = id$, $(\_)^+ \circ (\_)^- = (\_)^-$, and $(\_)^- \circ (\_)^- = 0$.
Observe that (6) actually follows from (1), (7) and (5).

By these laws we can always eliminate consecutive applications of $(\_)^+$ and
$(\_)^-$, except for sequences of $(\_)^+$. We shall write $u^k$ as a shorthand for $(\_)^+$ applied
k times to u and omit the parentheses. We assume $u^0 = u$, but we remark that
in general $u^k \neq u$. We call molecules the elements of this algebra. Given a
set S, we let $\mu(S)$ denote the set of molecules generated by S.

Lemma 1. For any natural number k and molecule u we have $(u^k)^- = u^-$.

Proof. By induction, applying law (6).

Proposition 1. For any natural k, and molecule u, we have $u^k = u^{k+1} \oplus u^-$.

Proof. By law (1), we have $u^k = (u^k)^+ \oplus (u^k)^-$, and $(u^k)^- = u^-$ by Lemma 1.

Corollary 1. For any natural k and molecule u, we have $u = u^k \oplus k \cdot u^-$.
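
The corollary is stated without proof. One way to obtain it, sketched here under the assumption that Proposition 1 reads $u^k = u^{k+1} \oplus u^-$ and that $u^k$ denotes $(\_)^+$ applied k times with $u^0 = u$, is a short induction on k:

```latex
\begin{align*}
u &= u^{0} \oplus 0 \cdot u^{-}
  && \text{(base case: } u^{0} = u \text{ and } 0 \cdot u^{-} \text{ is the unit)} \\
u &= u^{k} \oplus k \cdot u^{-}
  && \text{(induction hypothesis)} \\
  &= \bigl(u^{k+1} \oplus u^{-}\bigr) \oplus k \cdot u^{-}
  && \text{(Proposition 1)} \\
  &= u^{k+1} \oplus (k+1) \cdot u^{-}
  && \text{(associativity of } \oplus \text{)}
\end{align*}
```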

Of course we are interested in molecules centered on the places; these can be
of two forms, either $a^k$ or $a^-$. From the computational point of view, the $a^-$ are
the basic contexts, which carry very little information, since the nucleus can
produce as many of them as needed. To understand this point, one can think of
the tokens as ticket rolls with an unbounded number of tickets available. Readers
just take a ticket and return it after use for recycling, whereas consumers must
retrieve the entire roll.

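Definition 6. Given a contextual net $N = (S, T, \partial_0, \partial_1, \varsigma)$, we define the
category $\mathcal{M}(N)$ as the category with objects the molecules on S and with arrows
generated from the rules in Figure 3, modulo the axioms of strictly symmetric
strict monoidal categories in Figure 4.

We can now characterize contextual commutative processes algebraically.

$$\frac{u \in \mu(S)}{id_u: u \to u} \qquad \frac{t \in T,\ \partial_0(t) = u,\ \partial_1(t) = v,\ \varsigma(t) = w}{t: u \oplus w^- \to v \oplus w^-}$$

$$\frac{\alpha: u \to v,\ \beta: v \to w}{\alpha; \beta: u \to w} \qquad \frac{\alpha: u \to v,\ \beta: w \to z}{\alpha \oplus \beta: u \oplus w \to v \oplus z}$$

Fig. 3.

$$\alpha; (\beta; \gamma) = (\alpha; \beta); \gamma \qquad \alpha; id_v = id_u; \alpha = \alpha$$

$$\alpha \oplus (\beta \oplus \gamma) = (\alpha \oplus \beta) \oplus \gamma \qquad \alpha \oplus \beta = \beta \oplus \alpha$$

$$(\alpha; \beta) \oplus (\gamma; \delta) = (\alpha \oplus \gamma); (\beta \oplus \delta) \qquad id_{u \oplus v} = id_u \oplus id_v \qquad \alpha \oplus id_0 = \alpha$$

Fig. 4.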



Theorem 1. The category $\mathcal{CCP}(N)$ is isomorphic (via a monoidal functor) to
the full subcategory of $\mathcal{M}(N)$ whose objects are $S^\oplus$.

A very important property needed in the proof is what we call the maximum
sharing hypothesis, which can be expressed as below. This result contains the core
of the CTph, since it shows that whenever two or more tokens in the same place
a are used as contexts, we can always find an equivalent computation where only
one token is used (twice or more) as context.

Proposition 2. For any molecule u and natural numbers k and n, we have
$u^n \oplus u^k = u^{n+k} \oplus u$.

Proof. By Corollary 1 we have $u^{n+k} \oplus u = u^{n+k} \oplus u^k \oplus k \cdot u^-$. By commutativity
(and associativity) of $\_ \oplus \_$ we get $u^{n+k} \oplus u = u^k \oplus u^{n+k} \oplus k \cdot u^-$. By applying
Proposition 1 k times we have the result.



For instance, let us consider the net N in Figure 5. In $\mathcal{M}(N)$ we have three
basic arrows $t_0: a \oplus c^- \to c^-$, $t_1: b \oplus c^- \to c^-$ and $t_2: c \to 0$,
but neither $t_0$ nor $t_1$ can represent a commutative contextual process, since their
sources and targets are not elements of $S^\oplus$. To remedy this, we must put $t_0$ and
$t_1$ in an environment where the $c^-$ become instances of a ‘complete’ token, as
$id_{c^+} \oplus t_0: a \oplus c \to c$ and $id_{c^+} \oplus t_1$. The concurrent execution of $t_0$ and $t_1$
with shared context is instead written as $id_{c^2} \oplus t_0 \oplus t_1$. By the functoriality of
$\_ \oplus \_$ we have that
$id_{c^2} \oplus t_0 \oplus t_1 = (id_{c^+} \oplus t_0 \oplus id_b); (id_{c^+} \oplus t_1) = (id_{c^+} \oplus t_1 \oplus id_a); (id_{c^+} \oplus t_0)$
(recall that $id_{c^2} \oplus id_{c^-} = id_{c^+}$), i.e., $t_0$ and $t_1$ can execute in any order. Also
interesting is to observe that
$(id_{c^+} \oplus t_0) \oplus ((id_{c^+} \oplus t_1); t_2) = ((id_{c^+} \oplus t_0); t_2) \oplus (id_{c^+} \oplus t_1)$,
i.e., we have no causal information about the token consumed by $t_2$: is it the one
read by $t_0$, or the one read by $t_1$?

Fig. 5.

3 Individual Contexts

The maximum sharing hypothesis creates obvious problems when dealing with
the ITph, whose entire point is to be able to recognize how electrons are emitted
from tokens. For ordinary PT nets, the information about causality is recovered
in the algebraic setting by using (non strictly) symmetric strict monoidal categories,
i.e., by introducing symmetries for controlling rearrangements of tokens
when composing processes. At the level of states we still have standard markings.
At the level of computations (arrows), however, the tensor product is not commutative
anymore, so that we are able to interpret correctly the flow of causality
through token histories. Thus, the first step towards a uniform extension of the
CTph treatment of the previous section is to introduce symmetries on molecules.

There is however another problem to solve. Since the context $\varsigma(t)$ is modeled
by a self-loop on $\varsigma(t)^-$, two transitions with the same context can be concatenated
on it, as if one depended on the execution of the other. This spurious causal
dependency is to be avoided, as it gives rise to a wrong semantic model. We thus
choose to introduce a new kind of electrons, denoted by $u_-$, for representing used
(i.e., read) contexts. A transition t consumes a forward copy $\varsigma(t)^-$ of its context
and produces a backward copy $\varsigma(t)_-$ that cannot be read by other transitions.
We call bimolecules the (generalized) markings of the algebra that also includes
the operator $(\_)_-$, subject to a set of axioms formally identical to those involving
$(\_)^-$ in Figure 2. Given a set S, we write $\nu(S)$ for the set of bimolecules on S.

The final and key ingredient in our construction is to abandon the symmetry
of the monoidal categories involved. In a step similar to the one that led
from strictly symmetric to symmetric categories, we choose (non symmetric)
monoidal categories to which we adjoin exactly and only the symmetries we need.
In this way, we are able to omit those symmetries that would cause migration of
electrons from atom to atom. In the following we shall build on the construction
$\mathcal{Q}(N)$ for PT nets and, therefore, take a non-commutative monoid of objects.
We use the symbol $\otimes$ for the monoidal operation, which essentially amounts to
string concatenation. Given a string q, we denote by $\mu(q)$ its underlying multiset.
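
To make the passage from multisets to strings concrete, here is a minimal Python sketch restricted to strings of plain places (no nuclei or electrons). It computes the underlying multiset of a string and enumerates the linearizations of a multiset, i.e., the strings that have it as underlying multiset. The function names are assumptions of this sketch, and strings are modeled as tuples of place names.

```python
from collections import Counter
from itertools import permutations

def underlying_multiset(q):
    """Forget the order of a string of places, keeping only multiplicities."""
    return Counter(q)

def linearizations(m):
    """All distinct strings whose underlying multiset is m, in sorted order."""
    items = sorted(m.elements())
    return sorted(set(permutations(items)))
```

For instance, the multiset with two tokens in a and one in b has three linearizations, each of which is mapped back to the same multiset by `underlying_multiset`. The enumeration is factorial in the size of the multiset, so it is meant for small illustrative cases only.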

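Definition 7. Given a contextual net $N = (S, T, \partial_0, \partial_1, \varsigma)$, we define the category
$\mathcal{B}(N)$ as the category with objects the bimolecules on S and with arrows generated
from the rules in Figure 6, together with the symmetries
$\gamma_{a^x, b^k}: a^x \otimes b^k \to b^k \otimes a^x$ and $\gamma_{c^\delta, c^\epsilon}: c^\delta \otimes c^\epsilon \to c^\epsilon \otimes c^\delta$,
for $a, b, c \in S$ with $a \neq b$, for $x, k \in \mathbb{N} \cup \{{}^-, {}_-\}$,
and for $\delta, \epsilon \in \{{}^-, {}_-\}$. The arrows are taken modulo the axioms of strict monoidal
categories in Figure 7 (whenever the $\gamma$'s are defined) and the laws:

$$s; t_{p,q}; s' = t_{p',q'} \qquad (9)$$
$$\gamma_{a^\delta, a^\delta} = id_{a^\delta \otimes a^\delta}, \text{ for } \delta \in \{{}^-, {}_-\} \qquad (10)$$

for any symmetries $s: p' \to p$ and $s': q \to q'$, and any transition $t: \mu(p) \to \mu(q)$.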

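Note that we do not introduce the symmetries $\gamma_{a^k, a^-}$ and $\gamma_{a^k, a_-}$ that would
allow the electrons to flow from a nucleus to a different one. For example, starting
from $a \otimes a = a^+ \otimes a^- \otimes a^+ \otimes a^-$ and applying the arrow $id_{a^+} \otimes \gamma_{a^-, a^+} \otimes id_{a^-}$ we would reach

$$\frac{p \in \nu(S)}{id_p: p \to p} \qquad \frac{t \in T,\ \partial_0(t) \oplus \varsigma(t)^- = \mu(p),\ \partial_1(t) \oplus \varsigma(t)_- = \mu(q)}{t_{p,q}: p \to q} \qquad \frac{p, q \in S^\otimes}{\gamma_{p,q}: p \otimes q \to q \otimes p}$$

$$\frac{\alpha: p \to q,\ \beta: q \to r}{\alpha; \beta: p \to r} \qquad \frac{\alpha: p \to q,\ \beta: p' \to q'}{\alpha \otimes \beta: p \otimes p' \to q \otimes q'}$$

Fig. 6.

$$\alpha; (\beta; \sigma) = (\alpha; \beta); \sigma \qquad \alpha; id_q = id_p; \alpha = \alpha \qquad (\alpha; \beta) \otimes (\alpha'; \beta') = (\alpha \otimes \alpha'); (\beta \otimes \beta')$$

$$\alpha \otimes (\beta \otimes \sigma) = (\alpha \otimes \beta) \otimes \sigma \qquad \alpha \otimes id_0 = id_0 \otimes \alpha = \alpha \qquad id_{p \otimes q} = id_p \otimes id_q$$

$$(\alpha \otimes \beta); \gamma_{q,q'} = \gamma_{p,p'}; (\beta \otimes \alpha) \qquad \gamma_{p,q}; \gamma_{q,p} = id_p \otimes id_q \qquad \gamma_{p, q \otimes r} = (\gamma_{p,q} \otimes id_r); (id_q \otimes \gamma_{p,r})$$

Fig. 7.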



$a^+ \otimes a^+ \otimes a^- \otimes a^- = a^+ \otimes a \otimes a^-$, which is problematic. In fact, our representation
invariant is that the electrons associated to a certain nucleus $a^k$ in a string q are
the first k electrons (either $a^-$ or $a_-$) that appear in q to the right of $a^k$. Thus, for
consistency, we want exactly k electrons between $a^k$ and the next nucleus
$a^n$ occurring in q. The absence of those symmetries maintains this invariant.

As for $\mathcal{Q}(N)$ in [18], we introduce an arrow $t_{p,q}$ for all the possible
linearizations p and q of the source and target of each transition t of N. Law (9),
considered originally in [18], establishes a link between all the instances of a single
t, guaranteeing both consistency and a sensible computational interpretation
for such arrows. Actually, (9) expresses that the collection of the instances of t
forms a natural transformation between suitable functors. The reader is referred
to [?] for a thorough discussion of this topic. Laws (10) make the instances of
electrons associated to the same nucleus indistinguishable from each other by
asserting that the order in which they are used is immaterial.

To establish our representation result we need to refine contextual processes
in order to be able to concatenate them. As for similar cases in the literature,
this leads to introducing an ordering of the tokens in the source and target of the
process net, yielding the notion of strongly concatenable contextual processes.

Definition 8. Given a net N, a strongly concatenable contextual process is a
tuple $(\pi, \Theta, \leq_0, \leq_1)$ where $\pi$ is a contextual process with underlying contextual
process net $\Theta$, and $\leq_0$ and $\leq_1$ are total orders on the minimal and maximal places
of $\Theta$, respectively, such that $a \leq_0 b$ (resp. $a \leq_1 b$) implies $\pi_S(a) = \pi_S(b)$.

As for concatenable processes, a partial operation of sequential composition
can be defined. Provided the target of process $\pi$ coincides with the source of
process $\pi'$, it merges the maximal places of $\pi$ with the minimal places of $\pi'$
according to the orders $\leq_1$ and $\leq'_0$. The parallel composition of two processes
consists of taking their disjoint union and extending the orders on minimal and
maximal places by taking $a \leq b$ whenever a belongs to the first process and
b to the second. It can be shown that with these two operations the strongly
concatenable contextual processes of N form the arrows of a strict monoidal
category $\mathcal{SCCP}(N)$. Symmetries can be defined by taking a process that contains
just places (no transitions) with suitable orderings $\leq_0$ and $\leq_1$. Each place is
both minimal and maximal. These symmetries make $\mathcal{SCCP}(N)$ a symmetric
monoidal category. Due to space limitations we cannot give more details here.
We refer to [18] for the presentation of the category of strongly concatenable
processes, which is similar. We can now state the main result of the paper.

Theorem 2. The category $\mathcal{SCCP}(N)$ is isomorphic (via a symmetric monoidal
functor) to the full subcategory of $\mathcal{B}(N)$ whose objects are the elements of $S^\otimes$.

The proof is quite long and requires the introduction and the description
of the algebra of further kinds of processes that represent those arrows whose
sources and targets involve nuclei and electrons. In particular, we use suitable
inscriptions inside minimal and maximal places in order to represent the electrons
which have been moved away from the nuclei of their atoms and their
movements. However, such inscriptions are vacuous for processes whose source
and target are strings of places (as all the electrons are next to their nuclei), and
therefore we resort to strongly concatenable contextual processes as in Theorem 2.

For example, let us consider again the net N in Figure 5. In $\mathcal{B}(N)$ we find
the basic arrows $t_0: a \otimes c^- \to c_-$, $t'_0: c^- \otimes a \to c_-$, $t_1: b \otimes c^- \to c_-$, $t'_1: c^- \otimes b \to c_-$ and
$t_2: c \to 0$, with $t_0 = \gamma_{a, c^-}; t'_0$ and $t_1 = \gamma_{b, c^-}; t'_1$. The concurrent execution of $t_0$
and $t_1$ with shared context can be written as $(id_{c^+} \otimes \gamma_{c^-, a} \otimes id_b); (id_{c^2} \otimes t'_0 \otimes t'_1)$.
This time it differs from $t_0 \otimes id_{c^2} \otimes t'_1$ which, having source and target not in $S^\otimes$,
does not correspond to any process. The concurrent execution of $t_0$ and $t_1$ with
different contexts can instead be written as $\alpha = id_{c^+} \otimes t'_0 \otimes id_{c^+} \otimes t'_1$, and the
terms $\alpha; (id_c \otimes t_2)$ and $\alpha; (t_2 \otimes id_c)$ denote different processes: in the former $t_1$
causes $t_2$ and in the latter $t_0$ causes $t_2$.

Besides the fact that all the arrows of $\mathcal{B}(N)$ have a meaningful computational
interpretation, a further advantage of the present approach with respect
to the match-share categories of [6] is that the arrows of the model category
corresponding to pure concatenable processes can be distinguished just by looking
at their sources and targets, rather than by inspecting their construction.

Concluding Remarks and Future Work 

Building on an illuminating suggestion of Meseguer in [9], we have shown a way
to extend the algebraic semantics of PT nets proposed in [10] to contextual
nets, both in the collective token and the individual token interpretation. The
constructions rely on the choice of a non-free monoid of objects, whose elements
we called molecules and bimolecules. Furthermore, in treating the individual
token philosophy, we have renounced the symmetry of the monoidal category,
being then able to select only the symmetries consistent with our computational
interpretation in terms of strongly concatenable contextual processes.

Although we have worked only at the level of single nets, we believe that
our approach can be extended to constructions between categories of nets and
models, with restrictions analogous to those well-known in the literature [?].






Acknowledgements. Thanks to Jose Meseguer and the referees for helpful suggestions. 



References 

1. P. Baldan, A. Corradini, and U. Montanari. An event structure semantics for P/T 
contextual nets: Asymmetric event structures. In Proc. FoSSaCS’98, vol. 1378 of 
Lect. Notes in Comput. Sci., pp. 63-80. Springer, 1998. 

2. E. Best and R. Devillers. Sequential and concurrent behaviour in Petri net theory. 
Theoretical Computer Science, 55:87-136, 1987. 

3. R. Bruni, J. Meseguer, U. Montanari, and V. Sassone. A comparison of Petri net 
semantics under the collective token philosophy. In Proc. ASIAN’98, vol. 1538 of 
Lect. Notes in Comput. Sci., pp. 225-244. Springer, 1998. 

4. S. Christensen and N.D. Hansen. Coloured Petri nets extended with place capac- 
ities, test arcs and inhibitor arcs. In Applications and Theory of Petri Nets, vol. 
691 of Lect. Notes in Comput. Sci., pp. 186-205. Springer, 1993. 

5. P. Degano, J. Meseguer, and U. Montanari. Axiomatizing the algebra of net com- 
putations and processes. Acta Inform., 33(7):641-667, 1996. 

6. F. Gadducci and U. Montanari. Axioms for contextual net processes. In Proc. 
ICALP’98, vol. 1443 of Lect. Notes in Comput. Sci., pp. 296-308. Springer, 1998. 

7. U. Goltz and W. Reisig. The non-sequential behaviour of Petri nets. Inform. and 
Comput., 57:125-147, 1983. 

8. R. Janicki and M. Koutny. Semantics of inhibitor nets. Inform. and Comput., 
123:1-16, 1995. 

9. J. Meseguer. Rewriting logic as a semantic framework for concurrency: A progress 
report. In Proc. CONCUR’96, vol. 1119 of Lect. Notes in Comput. Sci., pp. 331-372. Springer, 1996. 

10. J. Meseguer and U. Montanari. Petri nets are monoids. Inform. and Comput., 
88(2):105-155, 1990. 

11. J. Meseguer, U. Montanari, and V. Sassone. On the semantics of place/transition 
Petri nets. Mathematical Structures in Computer Science, 7:359-397, 1997. 

12. U. Montanari and F. Rossi. Contextual occurrence nets and concurrent constraint 
programming. In Graph Transformations in Computer Science, vol. 776 of Lect. 
Notes in Comput. Sci., pp. 280-285. Springer, 1994. 

13. U. Montanari and F. Rossi. Contextual nets. Acta Inform., 32:545-596, 1995. 

14. C.A. Petri. Kommunikation mit Automaten. Ph.D. thesis, Institut für 
Instrumentelle Mathematik, Bonn, 1962. 

15. W. Reisig. Petri Nets: An Introduction. EATCS Monographs on Theoretical 
Computer Science. Springer, 1985. 

16. G. Ristori. Modelling Systems with Shared Resources via Petri Nets. Ph.D. thesis, 
Dipartimento di Informatica, Università di Pisa, 1994. 

17. V. Sassone. An axiomatization of the algebra of Petri net concatenable processes. 
Theoretical Computer Science, 170:277-296, 1996. 

18. V. Sassone. An axiomatization of the category of Petri net computations. Mathe- 
matical Structures in Computer Science, 8:117-151, 1998. 

19. R.J. van Glabbeek and G.D. Plotkin. Configuration structures. In Proc. LICS’95, 
pp. 199-209. IEEE Press, 1995. 

20. W. Vogler. Efficiency of asynchronous systems and read arcs in Petri nets. In Proc. 
ICALP’97, vol. 1256 of Lect. Notes in Comput. Sci., pp. 538-548. Springer, 1997. 

21. W. Vogler. Partial order semantics and read arcs. In Proc. MFCS’97, vol. 1295 of 
Lect. Notes in Comput. Sci., pp. 508-517. Springer, 1997. 

22. G. Winskel. Event structures. In Proc. of Advanced Course on Petri Nets, vol. 255 
of Lect. Notes in Comput. Sci., pp. 325-392. Springer, 1986. 




Asymptotically Optimal Bounds for OBDDs 
and the Solution of Some Basic OBDD Problems 

(Extended Abstract) 



Beate Bollig* and Ingo Wegener* 

FB Informatik, LS2, Univ. Dortmund, 44221 Dortmund, Germany 
bollig, wegener@ls2.cs.uni-dortmund.de 



Abstract. Ordered binary decision diagrams (OBDDs) are nowadays
the most common dynamic data structure or representation type for
Boolean functions. Among the many areas of application are verification,
model checking, and computer aided design. For many functions it
is easy to estimate the OBDD size but asymptotically optimal bounds
are only known in simple situations. In this paper, methods for proving
asymptotically optimal bounds are presented and applied to the solution
of some basic problems concerning OBDDs. The largest size increase by
a synthesis step of $\pi$-OBDDs followed by an optimal reordering is
determined as well as the largest ratio of the size of deterministic finite
automata and quasi-reduced OBDDs compared to the size of OBDDs.
Moreover, the worst case OBDD size of functions with a given number
of 1-inputs is investigated.



1 Introduction and Results 

Branching programs (BPs) are a well established representation type or
computation model for Boolean functions. Their size is tightly related to the nonuniform
space complexity (see e.g. [?]). Hence, one is interested in exponential lower
bounds for more and more general types of BPs (for the latest breakthrough for
semantic linear depth BPs see [?]). In order to use variants of BPs as a dynamic
data structure one needs a BP variant such that a list of important operations
(see e.g. [?]) can be performed efficiently. E.g., for verification, model checking,
and a lot of CAD applications we need an efficient test whether a representation
has a satisfying input (satisfiability test) and an efficient test whether two
representations describe the same function (equality test). These are NP-hard
problems for general BPs.

Bryant [?] has presented $\pi$-OBDDs as a simple BP variant allowing efficient 
algorithms for all important operations. Although we now have efficient 
algorithms for more general and, therefore, more compact representation types, 
$\pi$-OBDDs are used in most applications and the use of an OBDD package [?] 
is nowadays a standard technique. 

Definition 1. Let $X_n = \{x_1, \ldots, x_n\}$ be a set of Boolean variables. A 
variable ordering $\pi$ on $X_n$ is a permutation on $\{1, \ldots, n\}$ leading to 
the ordered list $x_{\pi(1)}, \ldots, x_{\pi(n)}$ of the variables. 

Definition 2. A $\pi$-OBDD on $X_n$ is a directed acyclic graph $G = (V, E)$ whose 
sinks are labeled by Boolean constants and whose non-sink (or inner) nodes are 
labeled by Boolean variables from $X_n$. Each inner node has two outgoing edges, 
one labeled by 0 and the other by 1. The edges between inner nodes have to respect 
the variable ordering $\pi$, i.e., if an edge leads from an $x_i$-node to an 
$x_j$-node, then $x_i$ precedes $x_j$ in $x_{\pi(1)}, \ldots, x_{\pi(n)}$. Each 
node $v$ represents a Boolean function $f_v : \{0,1\}^n \to \{0,1\}$ defined in the 
following way. In order to evaluate $f_v(a)$, $a \in \{0,1\}^n$, start at $v$. 
After reaching an $x_i$-node choose the outgoing edge with label $a_i$ until a 
sink is reached. The label of this sink defines $f_v(a)$. The size of the 
$\pi$-OBDD $G$ is equal to the number of its nodes. 
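
The evaluation rule of Definition 2 can be illustrated by a minimal Python 
sketch. The class name `Node`, the 0-based variable indices, and the use of 
Python booleans as sinks are assumptions made only for this illustration; they 
do not come from any OBDD package. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Inner node of an OBDD (hypothetical representation for illustration)."""
    var: int      # index i of the variable x_i tested at this node
    low: object   # successor reached via the edge labeled 0 (Node or bool sink)
    high: object  # successor reached via the edge labeled 1 (Node or bool sink)


def evaluate(v, a):
    """Follow the edge labeled a[i] at each x_i-node until a sink is reached."""
    while isinstance(v, Node):
        v = v.high if a[v.var] else v.low
    return v  # the label of the sink


def size(root):
    """Number of distinct nodes (inner nodes and sinks) reachable from root."""
    seen, stack = set(), [root]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if isinstance(v, Node):
            stack.extend((v.low, v.high))
    return len(seen)


# Example: x_0 XOR x_1 with the identity ordering.
x1_pos = Node(1, False, True)   # represents x_1
x1_neg = Node(1, True, False)   # represents NOT x_1
xor_root = Node(0, x1_pos, x1_neg)
```

Here `evaluate(xor_root, (1, 0))` follows the 1-edge at the $x_0$-node and the 
0-edge at the $x_1$-node, and `size(xor_root)` counts three inner nodes and two 
sinks. 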

Bryant [?] has already shown that the minimal-size $\pi$-OBDD for a function $f$ 
is unique (up to isomorphism) and it is called the reduced $\pi$-OBDD (or shortly 
the $\pi$-OBDD) for $f$. Its size is described by the following structure theorem 
[?]. In order to simplify the description we describe the theorem only for the 
special case where $\pi$ equals the identity $\mathrm{id}(i) = i$. 

Theorem 1. The number of $x_i$-nodes of the id-OBDD for $f$ is the number $s_i$ 
of different subfunctions $f|_{x_1 = a_1, \ldots, x_{i-1} = a_{i-1}}$, 
$a_1, \ldots, a_{i-1} \in \{0,1\}$, essentially depending on $x_i$ (a function $g$ 
depends essentially on $x_i$ if $g|_{x_i = 0} \neq g|_{x_i = 1}$). 

It is a simple corollary that the number $s_i^*$ of different subfunctions 
$f|_{x_1 = a_1, \ldots, x_{i-1} = a_{i-1}}$, $a_1, \ldots, a_{i-1} \in \{0,1\}$, 
is a lower bound on the id-OBDD size of $f$. Obviously, $\lceil \log s_i^* \rceil$ 
is the one-way deterministic communication complexity of $f$ if Alice holds 
$x_1, \ldots, x_{i-1}$ and Bob holds $x_i, \ldots, x_n$. (See [?] and [?] for the 
theory of communication complexity.) For non-constant functions the id-OBDD size 
of $f$ equals $s_1 + \cdots + s_n + 2$. If many $s_i$ have asymptotically the 
same and the largest value, one-way communication complexity is not strong 
enough to obtain asymptotically optimal bounds. Moreover, we have the freedom 
to choose an appropriate variable ordering for $f$. Let $\pi$-OBDD($f$) denote 
the $\pi$-OBDD size of $f$. 

Definition 3. The OBDD size of $f$ (denoted by OBDD($f$)) is the minimum of 
all $\pi$-OBDD($f$). 

Using Theorem 1 or one-way communication complexity a lot of exponential 
lower bounds on the OBDD size of functions have been proved (see, e.g., [?]). 
But there are only a few functions whose OBDD size is asymptotically known 
exactly. These are functions with linear OBDD size, symmetric functions, and 
a few more functions. We develop lower bound methods for asymptotically optimal 
OBDD bounds and solve problems motivated by OBDD applications, 
automata theory, and complexity theory. 

In order to use OBDDs we have to transform a logical description of a function, 
e.g. a circuit, into an OBDD representation. This is done by a sequence 
of binary synthesis steps. A binary synthesis step computes a $\pi$-OBDD $G_h$ for 
$h = f \otimes g$ ($\otimes$ is a binary Boolean operation like AND or EXOR) from 
$\pi$-OBDDs $G_f$ and $G_g$ for $f$ resp. $g$. Bryant [?] has shown how this can 
be done in time $O(|G_f| \cdot |G_g|)$ and he has presented an example that the 
result may need size $\Theta(|G_f| \cdot |G_g|)$. His example has two drawbacks. 
The chosen variable ordering is bad for $h$, $f$, and $g$ (and, therefore, such a 
synthesis step will not occur in applications) and the functions $f$ and $g$ 
depend essentially on disjoint sets of variables. It is not too hard to present 
an example without these drawbacks. But this is nevertheless not the answer to 
the question about the worst case for the binary synthesis problem. If a binary 
step leads to a $\pi$-OBDD much larger than the given $\pi$-OBDDs, all recent OBDD 
packages start to look for a better variable ordering. Although the search for 
an optimal OBDD variable ordering is NP-hard [2] and this holds even for the 
corresponding approximation problems for arbitrary constant factors [?], 
heuristic algorithms like sifting [?] often lead to very good results. Hence, 
the main step is a binary synthesis step followed by reordering. This leads to 
the problem whether it is possible that 
OBDD($h$) $= \Theta(\pi\text{-OBDD}(f) \cdot \pi\text{-OBDD}(g))$ for functions 
$f$ and $g$ essentially depending on all considered variables. (Here we should 
mention the folklore result (see [?]) that OBDD($h$) may be exponential even if 
OBDD($f$) and OBDD($g$) are linear but the linear size only is possible for 
different variable orderings.) In Section 4, we solve the problem by presenting 
an example where $\pi$-OBDD($f_n$) $= \Theta(n)$, $\pi$-OBDD($g_n$) $= \Theta(n)$, 
and OBDD($h_n$) $= \Theta(n^2)$. This surely is the less surprising answer but 
the lower bound proof for the OBDD size of $h_n$ has some interesting features. 
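
A binary synthesis step can be sketched as a memoized product construction. 
The following Python code is only an illustration in the spirit of the 
synthesis algorithm discussed above, not a reproduction of any package; the 
tuple representation `(var, low, high)`, the function names, and the assumption 
that a smaller variable index is tested earlier are choices made here. The memo 
table holds at most one entry per pair of nodes, which is where a bound of the 
form $|G_f| \cdot |G_g|$ comes from. 

```python
import operator


def mk(var, low, high):
    """Create a node unless the test is redundant (both edges reach the same node)."""
    return low if low == high else (var, low, high)


def top_var(u):
    """Index of the variable tested at u; sinks (bools) come after all variables."""
    return u[0] if isinstance(u, tuple) else float("inf")


def synthesize(op, u, v, memo=None):
    """Return the reduced OBDD for op(f_u, f_v); both inputs use the same ordering."""
    if memo is None:
        memo = {}
    if isinstance(u, bool) and isinstance(v, bool):
        return op(u, v)
    if (u, v) not in memo:
        t = min(top_var(u), top_var(v))
        # A diagram that does not test variable t is unchanged by fixing it.
        u0, u1 = (u[1], u[2]) if top_var(u) == t else (u, u)
        v0, v1 = (v[1], v[2]) if top_var(v) == t else (v, v)
        memo[(u, v)] = mk(t, synthesize(op, u0, v0, memo),
                          synthesize(op, u1, v1, memo))
    return memo[(u, v)]


def size(u):
    """Number of distinct nodes (inner nodes and sinks) reachable from u."""
    seen = set()

    def visit(w):
        if w in seen:
            return
        seen.add(w)
        if isinstance(w, tuple):
            visit(w[1])
            visit(w[2])

    visit(u)
    return len(seen)


x0 = (0, False, True)   # OBDD for the variable x_0
x1 = (1, False, True)   # OBDD for the variable x_1
h = synthesize(operator.xor, x0, x1)
```

Because equal functions are represented by equal tuples in this sketch, the 
result is automatically reduced; real packages achieve the same effect with a 
unique table of nodes. 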

Some applications [?] work with a restricted variant of $\pi$-OBDDs which may 
be called leveled $\pi$-OBDDs or quasi-$\pi$-OBDDs ($\pi$-QOBDDs). 

Definition 4. A $\pi$-QOBDD is a $\pi$-OBDD with the additional property that each 
edge leaving an $x_{\pi(i)}$-node, $i < n$, reaches an $x_{\pi(i+1)}$-node. 

We are interested in QOBDDs also because of their tight relationship to 
deterministic finite automata (DFAs) for so-called Boolean languages $L$ where 
$L \subseteq \{0,1\}^n$ for some $n$. It is an easy exercise to verify for 
$L_f = f^{-1}(1)$ that DFA($L_f$) $\le$ id-QOBDD($f$) $\le$ DFA($L_f$) $+ n$. 
Hence, id-QOBDDs and DFAs are almost the same. For general regular languages 
consisting of strings of different lengths it makes no sense to discuss 
different “variable orderings” or permutations of the input string. For Boolean 
languages, a $\pi$-DFA may apply the reordering $\pi$ to all inputs of length $n$. 
The above inequality can be generalized to 
$\pi$-DFA($L_f$) $\le$ $\pi$-QOBDD($f$) $\le$ $\pi$-DFA($L_f$) $+ n$. Moreover, it 
is obvious that $\pi$-QOBDD($f$) $\le (n + 1) \cdot \pi$-OBDD($f$) and this bound 
is tight for the constant functions (syntactically depending on $n$ variables). 
It is also not too difficult to see that 
$\pi$-QOBDD($f$) $= \Theta(n \cdot \pi\text{-OBDD}(f))$ for some function $f$ 
essentially depending on all $n$ variables. 

Definition 5. The multiplexer MUX$_n$ (or direct storage access function DSA$_n$) 
is defined on $n + k$ variables $a_{k-1}, \ldots, a_0, x_0, \ldots, x_{n-1}$ where 
$n = 2^k$. MUX$_n(a, x) = x_{|a|}$ where $|a|$ is the number whose binary 
representation equals $(a_{k-1}, \ldots, a_0)$. 

Let $\pi$ be the variable ordering $a_{k-1}, \ldots, a_0, x_0, \ldots, x_{n-1}$. 
Then $\pi$-OBDD(MUX$_n$) = OBDD(MUX$_n$) $= 2n + 1$. The $\pi$-OBDD starts with a 
complete binary tree on the $a$-variables. For the path where $|a| = i$ it is 
sufficient to test $x_i$. For the $\pi$-QOBDD we need $i$ extra nodes before the 
$x_i$-node and $n - 1$ extra nodes before each of the sinks. Hence, 
$\pi$-QOBDD(MUX$_n$) $= \frac{1}{2} n^2 + \frac{7}{2} n - 1 = 
\Theta(n \cdot \pi\text{-OBDD}(\mathrm{MUX}_n))$. But in order to compare the size 
of OBDDs and QOBDDs (and DFAs for Boolean languages with reordering) we ask 
whether QOBDD($f_n$) $= \Theta(n \cdot \mathrm{OBDD}(f_n))$ for functions $f_n$ 
essentially depending on $n$ variables. This is the question whether the 
possibility of OBDDs to omit the test of variables may save a size factor of 
$\Theta(n)$. 
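
For small $n$ the two sizes for the ordering "address variables first" can be 
checked by brute force. The sketch below counts subfunctions level by level: a 
reduced OBDD has one node per distinct subfunction that essentially depends on 
the tested variable (Theorem 1), while a quasi-reduced OBDD has one node per 
distinct subfunction. The function names are hypothetical and the approach is 
exponential in the number of variables, so it is meant only as a sanity check. 

```python
from itertools import product


def mux(bits, k):
    """bits = (a_{k-1}, ..., a_0, x_0, ..., x_{n-1}); returns x_{|a|}."""
    addr = int("".join(map(str, bits[:k])), 2)
    return bits[k + addr]


def level_widths(f, num_vars):
    """For the identity ordering return, per level i, the pair
    (number of distinct subfunctions after fixing the first i variables,
     number of those that essentially depend on variable i)."""
    widths = []
    for i in range(num_vars):
        rest = list(product((0, 1), repeat=num_vars - i))
        subs = {tuple(f(prefix + r) for r in rest)
                for prefix in product((0, 1), repeat=i)}
        half = len(rest) // 2  # first half: variable i = 0, second half: = 1
        depending = sum(1 for t in subs if t[:half] != t[half:])
        widths.append((len(subs), depending))
    return widths


def mux_sizes(k):
    """(OBDD size, QOBDD size) of MUX_n, n = 2^k, address variables first."""
    n = 2 ** k
    widths = level_widths(lambda b: mux(b, k), n + k)
    obdd = sum(d for _, d in widths) + 2    # plus the two sinks
    qobdd = sum(s for s, _ in widths) + 2   # plus the two sinks
    return obdd, qobdd
```

For $k = 1, 2, 3$ this yields OBDD sizes 5, 9, 17 (that is, $2n + 1$) and QOBDD 
sizes 8, 21, 59, which agree with $n^2/2 + 7n/2 - 1$ for these values of $n$. 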

Since it is the main rule of thumb for the variable ordering problem to test 
control or address variables (like the $a$-variables of MUX$_n$) before the data 
variables (like the $x$-variables of MUX$_n$), it was a well-established 
conjecture that the considered variable ordering $\pi$ is optimal for QOBDDs for 
MUX$_n$ and that QOBDD(MUX$_n$) $= \Theta(n^2)$. In Section 2, we prove the 
surprising result that QOBDD(MUX$_n$) $= \Theta(n^2 / \log n)$. Section 3 provides 
a function $f_n$ essentially depending on $n$ variables such that 
QOBDD($f_n$) $= \Theta(n \cdot \mathrm{OBDD}(f_n))$, proving that the freedom of 
OBDDs to omit tests may indeed decrease the size by a factor of $\Theta(n)$. 

In Section 5, we investigate the dependence of the OBDD size on the size of 
$f^{-1}(1)$. Let $N(a(n))$ be the number of Boolean functions $f$ where 
$|f^{-1}(1)| \le a(n)$. The standard counting argument proves the existence of 
functions $f_n$ where $|f_n^{-1}(1)| \le a(n)$ such that its OBDD size and even 
its circuit size is $\Omega(\log N(a(n)) / \log \log N(a(n)))$. On the other hand, 
obviously OBDD($f_n$) $= O(n\, a(n))$ for these functions. For $a(n) = 2^n$, the 
lower bound of size $2^n / n$ is optimal (see [?]). For $a(n) = 1$, the upper 
bound of size $n$ is optimal. The question is how large we can choose $a(n)$ 
such that we can define functions $f_n$ where $|f_n^{-1}(1)| \le a(n)$ and 
OBDD($f_n$) $= \Theta(n\, a(n))$. We describe such functions for polynomially 
increasing $a(n)$. 



2 QOBDDs and DFAs for the Multiplexer 

In this section, we determine the size of QOBDDs and DFAs with reordering for 
the representation of the multiplexer. 

Theorem 2. QOBDD(MUX$_n$) $= \Theta(n^2 / \log n)$. 

Sketch of proof. First, we prove for some variable ordering $\pi$ that the 
$\pi$-QOBDD size of MUX$_n$ is $O(n^2 / \log n)$. Let 
$m := k - \lfloor \log k \rfloor + 1$. The variable ordering $\pi$ is given by 
$a_{k-1}, \ldots, a_{k-m}, x_0, \ldots, x_{n-1}, a_{k-m-1}, \ldots, a_0$. The 
$x$-variables are partitioned to $2^m$ groups such that the indices of the 
variables of each group agree in their binary representation in the $m$ most 
significant bits. The size of each group is $n / 2^m$. We start with a complete 
binary tree of the first $m$ $a$-variables. The tree has $2^m$ leaves and 
$2^m - 1$ inner nodes. Then we test the $x$-variables of group $G_0$. Only the 
subfunction where $a_{k-1} = \cdots = a_{k-m} = 0$ essentially depends on these 
variables. For all other groups we need $x_i$-nodes, since we consider QOBDDs. 
Hence, we need one complete binary tree with $2^{n/2^m}$ leaves and 
$2^{n/2^m} - 1$ inner nodes and $(2^m - 1)\, n / 2^m < n$ further nodes which 
could be eliminated in OBDDs. One leaf can be replaced by the 0-sink. The same 
arguments work for the next group. Here the width for the first group is 
$2^{n/2^m}$ and the total width is bounded by $2 \cdot 2^{n/2^m} + 2^m - 2$. The 
crucial argument is the following one. We can merge the $2^{n/2^m}$ nodes for the 
case $(a_{k-1}, \ldots, a_{k-m}) = (0, \ldots, 0, 0)$ with the $2^{n/2^m}$ nodes 
for the case $(a_{k-1}, \ldots, a_{k-m}) = (0, \ldots, 0, 1)$. We only have to 
store the data vector, namely the $x$-vector. The result only depends on the 
further address bits. The width after the tests of the $x$-variables of group 
$G_{2^m - 1}$ equals $2^{n/2^m} - 1$. The size of the last $k - m$ levels is 
bounded by $2^{2^{k-m}}$. The total size is bounded above by 
$2^m - 1 + 2^{2^{k-m}} + n\,(2 \cdot 2^{n/2^m} + 2^m - 2) < 
n 2^m + 2n\, 2^{n/2^m} + 2^{2^{k-m}}$. By the choice of $m$, 
$2n / \log n \le 2^m < 4n / \log n$, and we obtain an upper bound of 
$O(n^2 / \log n)$. For the lower bound for arbitrary variable orderings $\pi$ it 
is sufficient to prove a lower bound of size $\Omega(n / \log n)$ on the size of 
$\Omega(n)$ $x$-levels of $\pi$-QOBDDs. By standard lower bound techniques it can 
be shown that the $x_i$-level of a $\pi$-QOBDD representing MUX$_n$ has a size of 
$\Omega(n / \log n)$ if $x_i$ belongs to the second quarter of all $x$-variables 
with respect to $\pi$ (for more details see [?]). □ 
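
The arithmetic behind the choice of $m$ can be checked numerically. The sketch 
below assumes that the upper bound in the proof sketch has the three-term form 
$n 2^m + 2n\, 2^{n/2^m} + 2^{2^{k-m}}$ (this reading of the formula is an 
assumption where the extracted text is unclear) and uses $n / 2^m = 2^{k-m}$. 
The function name `parameters` is hypothetical. 

```python
def parameters(k):
    """Return (n, m, group_size, bound) for n = 2^k and m = k - floor(log2 k) + 1."""
    n = 2 ** k
    m = k - (k.bit_length() - 1) + 1      # k.bit_length() - 1 == floor(log2 k)
    group_size = 2 ** (k - m)             # n / 2^m variables per group
    # Assumed three-term upper bound on the QOBDD size from the proof sketch.
    bound = n * 2 ** m + 2 * n * 2 ** group_size + 2 ** group_size
    return n, m, group_size, bound


def check(k):
    """Check 2n/log n <= 2^m < 4n/log n and that bound <= 5 n^2 / log n."""
    n, m, _, bound = parameters(k)
    window_ok = 2 * n <= 2 ** m * k < 4 * n   # log n = k
    return m <= k and window_ok and bound * k <= 5 * n * n
```

For $k = 8$, for instance, $n = 256$, $m = 6$, each group has 4 variables, and 
the assumed bound evaluates to 24592, about three times $n^2 / \log n = 8192$. 
The constant 5 in `check` is an ad hoc choice that happens to hold for 
$2 \le k \le 16$; it is not taken from the paper. 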

This result implies by the discussion in Section 1 the same result for DFAs 
with reordering. We only want to mention here that we can get similar results 
for so-called zero-suppressed BDDs (ZBDDs) which are used in many applications 
(see, e.g., [?]). This is the first example of a function (moreover, a “natural” 
function) and of BDD models of practical relevance for which the rule of thumb 
“control variables before data variables” is wrong. 

3 The Maximal Size Gap between OBDDs, QOBDDs 
and DFAs 

We look for functions $f_N$ essentially depending on all their $N$ variables such 
that the size gap of OBDDs on one hand and QOBDDs and DFAs with reordering 
on the other hand is a factor of $\Theta(N)$ which is by the discussion in 
Section 1 the largest possible gap. For such a function it is necessary that 
many edges in the optimal OBDD omit many tests. Therefore, the multiplexer has 
been considered as a good candidate. But the result of Section 2 implies that 
the multiplexer only leads to a gap of $\Theta(N / \log N)$. The multiplexer has 
many more data variables than address variables. We can prove the largest 
possible gap for a function $f_N$ on $N = n^2 + 2n$ variables, among them $n^2$ 
control or address variables (here called selection variables) 
$s_0, \ldots, s_{n^2 - 1}$ and only $2n$ data variables 
$x_0, y_0, \ldots, x_{n-1}, y_{n-1}$ which lead to data pairs $x_i y_j$, 
$0 \le i, j \le n - 1$. The data pairs are partitioned to $n$ blocks $B_m$, 
$0 \le m \le n - 1$, such that $B_m$ contains all pairs $x_i y_{i+m}$ (the 
indices are computed mod $n$). We consider the following ordering 
$p_0, \ldots, p_{n^2 - 1}$ of the pairs. The pair $p_k$ where $k = in + j$ equals 
$x_j y_{i+j}$, i.e., we start with the pairs from $B_0$, followed by the pairs 
from $B_1$, and so on. In each block we start with the pair containing $x_0$, 
followed by the pair containing $x_1$, and so on. The main property is that the 
distance between two pairs containing $x_i$ equals $n$ and the distance between 
two pairs containing $y_i$ is at least $n - 1$. Finally, we define $f_N$ by 
$f_N(s, x, y) = \bigvee_{0 \le k \le n^2 - 1} s_0 \cdots s_{k-1} \overline{s}_k\, p_k$, 
i.e., the $s$-vector selects with its first zero which pair has to be evaluated. 

Theorem 3. OBDD($f_N$) $= \Theta(N)$ while QOBDD($f_N$) $= \Theta(N^2)$ and 
DFA($f_N$) $= \Theta(N^2)$. 

Sketch of proof. It is obvious that the OBDD size of $f_N$ for the “natural” 
variable ordering 
$s_0, \ldots, s_{n^2 - 1}, x_0, \ldots, x_{n-1}, y_0, \ldots, y_{n-1}$ equals 
$2n^2 + n + 2 = \Theta(N)$ and that $f_N$ essentially depends on all its 
variables. This implies OBDD($f_N$) $= \Theta(N)$ and the upper bounds for the 
other models. 
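
The size $2n^2 + n + 2$ for the natural ordering can be confirmed by brute 
force for very small $n$. The sketch below assumes that a data pair $x_i y_j$ 
stands for the conjunction of the two variables and that $f_N$ is 0 when the 
$s$-vector contains no zero; the function names are hypothetical, and the node 
count uses the subfunction characterization of Theorem 1. 

```python
from itertools import product


def make_f(n):
    """f_N on the variables s_0..s_{n^2-1}, x_0..x_{n-1}, y_0..y_{n-1} (in this order)."""
    def f(bits):
        s = bits[:n * n]
        x = bits[n * n:n * n + n]
        y = bits[n * n + n:]
        for k, s_k in enumerate(s):
            if s_k == 0:                         # first zero selects the pair p_k
                i, j = divmod(k, n)              # k = i*n + j
                return x[j] & y[(i + j) % n]     # p_k = x_j y_{i+j}, indices mod n
        return 0                                 # assumption: no zero in s gives 0
    return f


def obdd_size(f, num_vars):
    """Size of the reduced OBDD for the identity ordering, via Theorem 1."""
    total = 2                                    # the two sinks
    for i in range(num_vars):
        rest = list(product((0, 1), repeat=num_vars - i))
        half = len(rest) // 2
        subs = {tuple(f(prefix + r) for r in rest)
                for prefix in product((0, 1), repeat=i)}
        # Count subfunctions that essentially depend on variable i.
        total += sum(1 for t in subs if t[:half] != t[half:])
    return total
```

For $n = 2$ ($N = 8$) the count is 12 and for $n = 3$ ($N = 15$) it is 23, both 
equal to $2n^2 + n + 2$. 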

In the following we present a sketch of the proof for the lower bound on the 
QOBDD size (for more details see [?]). This implies the lower bound for DFAs 
with reordering. We fix an arbitrary variable ordering $\pi$. There are $n^2/24$ 
levels where a selection variable is tested and the number of already tested 
selection variables is at least $n^2/8$ and less than $n^2/6$. It is sufficient 
to prove that each of these levels has a size of $\Omega(n^2)$. 

We fix one of the described levels and use the following notation. The sets 
$T(x)$, $T(y)$, and $T(s)$ contain the $x$-, $y$-, and $s$-variables, resp., which 
are tested before the considered level. The sets $R(x)$, $R(y)$, and $R(s)$ 
contain the corresponding remaining variables. Let $T(x, y) := T(x) \cup T(y)$. 
We distinguish whether the size of $T(x, y)$ is small, large, or medium. 

Case 1: $|T(x, y)| <$ […] (small size). 

Case 2: There are at least $2 \log n$ variables in $T(x)$ (or $T(y)$) such that 
for each of these variables $x_i$ there is a pair $p_k = x_i y_j$ where 
$s_k \in R(s)$ (large size). 

Remark: This case is called “large size”, since one of the conditions 
$|T(x)| >$ […] $+ 2 \log n$ and $|T(y)| >$ […] $+ 2 \log n$ is sufficient for 
Case 2. 

Case 3: Not Case 1 or Case 2 (medium size). 

If $|T(x, y)|$ is small, we can argue similarly to the case of the natural 
variable ordering. If it is large, we have to store too much information on the 
data variables like in the case of the multiplexer. The most difficult case is 
the case where $T(x, y)$ is of medium size. We assume that $|T(x)| >$ […] (the 
other case can be handled similarly). There is a subset $T'(x)$ of $T(x)$ of 
size […] $- 2 \log n >$ […] (for large $n$) such that $s_k \in T(s)$ for all 
pairs $p_k = x_i y_j$ where $x_i \in T'(x)$. The condition 
$|T(y)| <$ […] $+ 2 \log n <$ […] (for large $n$) implies for each 
$x_i \in T'(x)$ the existence of at least […]$n$ pairs $p_k = x_i y_j$ such that 
$s_k \in T(s)$ and $y_j \in R(y)$. 

We have partitioned the set of all pairs $p_0, \ldots, p_{n^2 - 1}$ into $n$ 
blocks $B_0, \ldots, B_{n-1}$ such that $B_i = \{p_{in}, \ldots, p_{in+n-1}\}$. 

Claim 1. There are […] good blocks, i.e., blocks each containing at least […] 
of the chosen pairs. 

Now we investigate the set $P$ of the chosen pairs $p_k$ belonging to the 
[…]$n$ good blocks. For each such pair $p_k = x_i y_j$ we know that 
$s_k \in T(s)$, $x_i \in T(x)$, and $y_j \in R(y)$. Moreover, we define a 
subfunction $g_k$ of $f_N$ by assigning the following values to the tested 
variables. We assign the value 0 to $s_k$ and the value 1 to all other variables 
in $T(s)$. We assign the value 1 to $x_i$ and the value 0 to all other variables 
in $T(x, y)$. The function $g_k$ has a prime implicant consisting of $y_j$ and 
perhaps some $s$-variables. It is possible that $g_k = g_l$ if $k \neq l$, but 
then we have some implications on the set $T(s)$. 

Claim 2. If $k < l$ and $g_k = g_l$, then $s_{k+1}, \ldots, s_{l-1} \in T(s)$ and 
the pairs $p_k$ and $p_l$ contain the same $y$-variable. 

Since each variable is contained in exactly one pair in each block, $g_k = g_l$ 
implies that $p_k$ and $p_l$ belong to different blocks. 

Claim 3. There are less than […] blocks containing some $p_k \in P$ such that 
$g_k = g_l$ for some $p_l \in P$ and $l \neq k$. 

From Claim 1 and Claim 3 we conclude that there are at least […] blocks each 
containing […] pairs such that all the corresponding subfunctions $g_k$ are 
different. This implies the lower bound $\Omega(n^2)$ on the size of the level. □ 

With similar arguments we can prove that ZBDD($f_N$) = […]. 



4 The Maximal Size Increase of a $\pi$-OBDD Synthesis 
Step with Optimal Reordering 

In this section, we prove that the synthesis of $\pi$-OBDDs essentially 
depending on the same set of variables can lead to a multiplicative size 
increase (like the well-known result for DFAs) and that this result even holds 
if the synthesis can be followed by an optimal reordering. The functions are 
defined on $n + 2k$ variables 
$x_0, \ldots, x_{k-1}, y_0, \ldots, y_{k-1}, z_0, \ldots, z_{n-1}$ where 
$n = 2^k$. Let $f_n(x, y, z) = \mathrm{MUX}_n(x, z)$ and 
$g_n(x, y, z) = \mathrm{MUX}_n(y, z)$. These functions do not depend essentially 
on all variables. Nevertheless, we first investigate $f_n$, $g_n$, and 
$h_n = f_n \oplus g_n$. Afterwards, we define modified functions $f_n^*$, 
$g_n^*$, and $h_n^* = f_n^* \oplus g_n^*$ depending essentially on all variables 
and having similar properties. The function $h_n$ is defined by 

$$h_n(x, y, z) = \bigvee_{0 \le i, j \le n-1} (|x| = i) \wedge (|y| = j) \wedge (z_i \oplus z_j).$$ 
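
That this disjunction coincides with $\mathrm{MUX}_n(x, z) \oplus 
\mathrm{MUX}_n(y, z)$ can be checked exhaustively for small $k$. The following 
Python sketch does so; the function names are hypothetical and the address 
vectors are read with the most significant bit first, which is an assumption 
about the bit order that does not affect the identity. 

```python
from itertools import product


def value(bits):
    """Number whose binary representation is given by bits (most significant first)."""
    return int("".join(map(str, bits)), 2)


def h_direct(x, y, z):
    """Disjunction over all i, j of (|x| = i) AND (|y| = j) AND (z_i XOR z_j)."""
    n = len(z)
    return int(any(value(x) == i and value(y) == j and (z[i] ^ z[j])
                   for i in range(n) for j in range(n)))


def h_xor(x, y, z):
    """MUX_n(x, z) XOR MUX_n(y, z)."""
    return z[value(x)] ^ z[value(y)]


def agree(k):
    """True iff both definitions agree on all inputs for n = 2^k."""
    n = 2 ** k
    return all(h_direct(x, y, z) == h_xor(x, y, z)
               for x in product((0, 1), repeat=k)
               for y in product((0, 1), repeat=k)
               for z in product((0, 1), repeat=n))
```

Calling `agree(1)` and `agree(2)` enumerates all inputs for $n = 2$ and $n = 4$. 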

Theorem 4. Let $\pi^*$ be the variable ordering 
$x_0, \ldots, x_{k-1}, y_0, \ldots, y_{k-1}, z_0, \ldots, z_{n-1}$. Then 
$\pi^*$-OBDD($f_n$) = $\pi^*$-OBDD($g_n$) $= 2n + 1$ but 
OBDD($h_n$) $= \Theta(n^2)$, i.e., a synthesis step followed by optimal 
reordering can lead to a multiplicative size increase. 

Fig. 1. a) A macroscopic view. b) A microscopic view. 



We prove a lower bound of size $\Omega(n^2)$ by proving that each of $\Omega(n)$ 
levels has size $\Omega(n)$. The interesting aspect is that there is not 
necessarily a block of $\Omega(n)$ levels each of size $\Omega(n)$. It may happen 
that small levels and large levels occur in a rather irregular order. 
Nevertheless, we are able to bound the number of small levels in a sufficient 
way. This is the first proof of an asymptotically optimal OBDD lower bound in 
such a situation. 

Sketch of proof of Theorem 4. The upper bounds are obvious. For the lower 
bound proof we fix an arbitrary variable ordering $\pi$. First, we visualize the 
situation after the test of some variables. We do not use the communication 
matrix, since we believe that a different representation better supports the 
counting of different subfunctions essentially depending on some specific 
$z$-variable. Again we use the notation $T(x)$, $T(y)$, and $T(z)$ for the sets 
of already tested variables. Let $r := |T(x)|$ and $c := |T(y)|$. Then we have 
$2^r$ partial $x$-addresses which partition the set of all $z$-variables into 
$2^r$ blocks of size $n 2^{-r}$ each. Two $z$-variables $z_i$ and $z_j$ belong to 
the same block if the binary representations of $i$ and $j$ agree in the 
positions belonging to variables in $T(x)$. In the same way we obtain $2^c$ 
blocks of size $n 2^{-c}$ each corresponding to the variables in $T(y)$. We 
consider the following $n \times n$-matrix. The rows correspond to the 
$z$-variables and they are ordered blockwise with respect to the $2^r$ row 
blocks. In each block we order the variables according to the canonical ordering 
with respect to the vector describing the value of the $x$-variables which have 
not been tested yet. The columns also correspond to the $z$-variables and they 
are ordered blockwise with respect to the $2^c$ column blocks. The entry at the 
$z_i$-row and the $z_j$-column equals $z_i \oplus z_j$. Our aim is to prove a 
lower bound on the size of the $z_i$-level of the $\pi$-OBDD representing $h_n$. 
For this reason it is sufficient to investigate those $2^r + 2^c - 1$ 
assignments to the variables in $T(x, y)$ where at least one of the partial 
addresses allows the value $i$. In Fig. 1a) the corresponding blocks are shaded. 
Our aim is to count the number of different subfunctions essentially depending 
on $z_i$. First, we consider the subfunctions for some fixed assignment to the 
variables in $T(x, y)$. This leads to a submatrix of the matrix considered above 
(see Fig. 1b)). In our example the submatrix contains the $z_i$-row but not the 
$z_i$-column. It is called a $z_i$-row-rectangle. Variables $z_j \in T(z)$ are 
replaced by $a_j$. This $z_i$-row-rectangle is a description of the considered 
subfunction of $h_n$. Let $s$ be the number of different variables from $T(z)$ 
which correspond to a column or row of the considered submatrix. Then we obtain 
exactly $2^s$ different subfunctions all essentially depending on $z_i$. Similar 
results hold for $z_i$-column-rectangles. 

Is it possible to obtain the same subfunction for different assignments to the 
variables from $T(x, y)$? This happens iff the $a$-entries are replaced in such a 
way by constants that two $z_i$-rectangles are equal. We assume that $r < k$ or 
$c < k$ (if $r = c = k$, all address variables have been tested which leads to 
an easy subcase). A $z_i$-row-rectangle $R_i$ differs from a 
$z_i$-column-rectangle $C_i$, since $R_i$ contains entries essentially depending 
on $z_i$ exactly in the $z_i$-row while this happens in $C_i$ exactly in the 
$z_i$-column. Now we consider w.l.o.g. two $z_i$-row-rectangles $R_i$ and $R_i'$. 
If $R_i$ contains a $z_m$-column, it contains a column where all entries 
essentially depend on $z_m$. This cannot happen in $R_i'$ where at most one row 
can depend essentially on $z_m$. Hence, two different $z_i$-row-rectangles agree 
iff all $z$-variables corresponding to the columns have already been tested and 
the corresponding vectors are equal. The only-if-part follows from the 
consideration of the case $|x| = i$. The if-part follows, since the remaining 
assignments to $x$-variables and $z$-variables belonging to rows of the block 
have the same influence on $R_i$ and $R_i'$. 

Summarizing we can conclude that we are able to determine the size of each 
$z$-level of a $\pi$-OBDD representing $h_n$. We still have to prove that 
$\Omega(n)$ $z$-levels have size $\Omega(n)$. The first and last $z$-levels can be 
very small. We concentrate on the levels where the $z$-variables at the 
positions $n/2 + 1, \ldots, 3n/4$ of the variable ordering on the $z$-variables 
are tested. 

Case 1. There is no block such that all corresponding row variables have 
already been tested. (The same arguments work for the column variables.) 

We consider the $z_i$-column-rectangles. Let $r_j$ be the number of 
$T(z)$-variables belonging to the $j$th row block, $1 \le j \le 2^r$. The sum of 
all $r_j$ is at least $n/2$ (by the choice of the considered levels) and our 
lower bound arguments lead to the lower bound $\sum_{1 \le j \le 2^r} 2^{r_j}$ on 
the size of the considered $z_i$-level. This lower bound is minimal if the sum 
of all $r_j$ is equal to $n/2$ and all $r_j$ are equal to $(n/2)/2^r$. This leads 
to the lower bound $2^r\, 2^{n/2^{r+1}}$. As long as $n/2^{r+1} \ge 2$, this 
exponent decreases at least by 1 if $r$ is increased by 1. In these cases the 
lower bound is decreasing with $r$. This happens as long as $r + 2 \le \log n$. 
Hence, we obtain an $\Omega(n)$ bound on the size of the $z_i$-level. 

Case 2. Not Case 1 and $r \le \log n - \log \log n$ (or 
$c \le \log n - \log \log n$). 

There is a $z_i$-column-rectangle where all $n 2^{-r}$ $z$-variables 
corresponding to the rows of the rectangle have already been tested. This leads 
to a lower bound of size $2^{n 2^{-r}} \ge n$. 

Case 3. $r \ge \log n - c^*$ (or $c \ge \log n - c^*$) for some constant $c^*$ 
($c^* = 8$ is appropriate in this proof). 

We have $2^r$ $z_i$-column-rectangles and at least $n/4$ $z$-variables which have 
not been tested. Each row block contains $n 2^{-r}$ $z$-variables. Hence, there 
are at least $\frac{1}{4} \cdot 2^r = \Omega(n)$ $z_i$-column-rectangles such that 
at least one row variable has not been tested. For each of these rectangles we 
obtain a lower bound of size 1 and we have shown above that the different 
$z_i$-column-rectangles cannot agree and describe different subfunctions. 
Altogether, we obtain also in this case a lower bound of size $\Omega(n)$. 

We obtain the proposed $\Omega(n^2)$ lower bound if $\Omega(n)$ of the $n/4$ 
considered levels belong to one of the three cases. Hence, we only have to 
consider the situation where $n/4 - o(n)$ of the considered levels do not fulfil 
the assumptions of one of the three cases. We assume w.l.o.g. that on 
$n/8 - o(n)$ of these levels the condition $c \le r$ holds. On these levels, 
$\log n - \log \log n < c \le r < \log n - c^*$. Only to simplify the notation we 
assume that $N = \log \log n$ is an integer. We consider the levels where 
$r = \log n - \log \log n + t$ has a fixed value. In particular, 
$1 \le t < \log \log n - c^*$. We have $2^r = 2^t n / \log n$ 
$z_i$-column-rectangles. Since at least $n/4$ $z$-variables have not been tested, 
there are at least $2^{r-2}$ $z_i$-column-rectangles such that some corresponding 
row variable has not been tested. Let $m$ be the number of already tested 
$z$-variables belonging to the column block containing $z_i$. This leads to the 
lower bound $\frac{1}{4}\, 2^r\, 2^m$. If $m \ge \log \log n - t$, the lower 
bound is of size $\Omega(n)$. We have $2^c$ column blocks of size $n 2^{-c}$ each. 
Hence, there are at most $2^c (\log \log n - t) \le 2^r (\log \log n - t)$ bad 
$z$-levels where we have not proved an $\Omega(n)$ bound. Let $n > 4$. Then the 
number of bad levels can be estimated by 
$\sum_{1 \le t < N - c^*} 2^t (N - t)\, n / \log n \le (c^* + 2)\, 2^{-c^*} n$. 

For $c^* = 8$ these are at most $\frac{5}{128} n$ bad levels out of 
$\frac{1}{8} n - o(n)$ levels. Hence, also in this situation we have proved the 
existence of $\Omega(n)$ levels whose size is $\Omega(n)$. This implies the 
proposed $\Omega(n^2)$ bound. □ 

In a last step, we have to generalize our results to functions essentially 
depending on all their variables (for details see [3]). 

5 On the Maximal OBDD Size with Respect 
to the Number of 1-Inputs 

For the construction of a function $f_{n,k}$ with 
$|f_{n,k}^{-1}(1)| = \binom{n}{k}$ and 
OBDD($f_{n,k}$) $= \Theta(n\, |f_{n,k}^{-1}(1)|)$ we use the construction of 
Kovari, Sos, and Turan [?] for the solution of the well-known problem of 
Zarankiewicz. Their result can be explained as follows. Let $n = p^2$ for some 
odd prime $p$. Let $U := \{0, \ldots, n - 1\}$ be the universe which is 
partitioned to $p$ blocks $B_0, \ldots, B_{p-1}$ where 
$B_i = \{ip, \ldots, ip + p - 1\}$. Then it is possible to define explicitly 
sets $A_0, \ldots, A_{n-1}$ with the following properties: 

- $|A_i| = p$ for all $i$, 

- $|A_i \cap A_j| \le 1$ for all $i \neq j$, 

- $|A_i \cap B_j| = 1$ for all $i$ and $j$, 

- for all $i \in B_k$ and $j \in B_l$ where $k \neq l$ there exists some $m$ 
  such that $i, j \in A_m$. 

For the definition of $f_{n,k}$ we consider for each choice of 
$0 \le i_1 < i_2 < \cdots < i_k < n$ the set $A_{i_1, \ldots, i_k}$ defined as the 
union of all $A_{i_j}$, $1 \le j \le k$, and the corresponding minterm on 
$\{x_0, \ldots, x_{n-1}\}$ which computes 1 iff $x_i = 1$ exactly for all 
$i \in A_{i_1, \ldots, i_k}$. The function $f_{n,k}$ computes 1 iff one of the 
minterms computes 1. 

Theorem 5. Let $k$ be a constant. Then $|f_{n,k}^{-1}(1)| \le \binom{n}{k}$ and 
OBDD($f_{n,k}$) $= \Theta(n^{k+1})$. 

Proof. By definition, $|f_{n,k}^{-1}(1)| \le \binom{n}{k}$. (For large $n$, even 
$|f_{n,k}^{-1}(1)| = \binom{n}{k}$.) This implies the upper bound 
$n \binom{n}{k} + 2$ on the OBDD size of $f_{n,k}$, since at most $\binom{n}{k}$ 
of the subfunctions obtained by assignments to some set of variables can be 
different from the constant 0. 

For the lower bound proof we fix a variable ordering $\pi$ and investigate the 
set $P$ of all 1-paths, namely all computation paths corresponding to the 
minterms. The proof strategy is the following one. We identify a set 
$P' \subseteq P$ such that two different paths from $P'$ have been split before 
or at level […]$n$, i.e., there is a node where one path chooses the 0-edge and 
the other one chooses the 1-edge. Afterwards, we identify a subset of $P'$ such 
that two paths from this subset cannot share a node on the levels 
[…]$n, \ldots,$ […]$n$. We ensure that the size of this subset is $\Omega(N)$ 
for $N = \binom{n}{k}$. This proves the lower bound. 

First we remark that $|A_{i_1, \ldots, i_k} \cap A_i| \le k$, if 
$i \notin \{i_1, \ldots, i_k\}$. This has the following consequences. Since (for 
large $n$) $p \ge k + 2$, the inputs from $f_{n,k}^{-1}(1)$ have a Hamming 
distance of at least 2 and each 1-path contains $n$ inner nodes. Moreover, an 
input $a'$ which is the characteristic vector of $A'$ such that 
$|A' \cap A_i| \ge k + 1$ and $A_i \not\subseteq A'$ has the property that 
$f_{n,k}(a') = 0$. 

As next step, we prove that many 1-paths split early. Let $I$ contain the 
indices of the first […]$n$ variables according to $\pi$. The average size of 
all $B_i \cap I$ equals […] and (for large $n$) there are two different blocks 
$B_i$ and $B_j$ such that $|B_i \cap I| \ge$ […]$p$ and 
$|B_j \cap I| \ge$ […]$p$. There are at least […] $= \Omega(N)$ (remember that 
$k$ is a constant) possibilities to choose $k$ elements $i_1, \ldots, i_k$ from 
$B_i \cap I$ and $k$ elements $j_1, \ldots, j_k$ from $B_j \cap I$. We identify 
each such choice with a unique minterm. The pair $(i_r, j_r)$ determines by the 
properties of the $A$-sets uniquely a set $A_{m_r}$ such that 
$i_r, j_r \in A_{m_r}$. Since 
$A_{m_1, \ldots, m_k} \cap B_i = \{i_1, \ldots, i_k\}$ and 
$A_{m_1, \ldots, m_k} \cap B_j = \{j_1, \ldots, j_k\}$, different choices lead to 
different 1-paths. Let $P'$ be the set of the chosen $\Omega(N)$ 1-paths. Let us 
consider two different of these 1-paths or minterms. They correspond to the 
choices $i_1, \ldots, i_k, j_1, \ldots, j_k$ and 
$i_1', \ldots, i_k', j_1', \ldots, j_k'$ and w.l.o.g. 
$i_1 \notin \{i_1', \ldots, i_k'\}$. The variable $x_{i_1}$ is tested on one of 
the first […]$n$ levels and the first minterm chooses a 1-edge on this level and 
the second one a 0-edge. Hence, the paths from $P'$ split before or at level 
[…]$n$. 

As final step, we prove that many 1-paths from P' do not merge again before 
level |n. Let I* contain the indices of the first |rz variables according to tt. Let 
r be the number of rich sets Ai, i.e., sets where \Ai fl J*| > p — k. We prove 
by contradiction that r < p — k — 1. We assume that \Ai^ fl /*| > p — k, . . . , 
Fl/*| >p-k. Since |Aij nAiJ < 1, |(Ajj UAiJn/*| > (p-k) + {p-k-l). 
In the same way, we conclude (for large n) that \{Ai^ U • • • U Ai^_,^) fl I*| > 
{p — k) + {p — k — 1) -\ -I- 1 > \{p— kY > fzz in contradiction to |/*| = |rz. 

Let P" C P be the set of 1-paths corresponding to sets Aij^,,.yj, such that 
all Ai^ are poor, i.e., not rich. The number of rich A-sets has been shown to be 
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at most p — k — 1. Hence, we have more than n — p poor sets and more than 
sets consisting of poor sets only. Since n — p = n — o{n) and fc is a 

constant, \P”\ > = N — o{N). P' and P” are subsets of P where |P| = N, 

\P'\ = f2{N), and \P"\ = N — o{N). Hence, |P'nP"| = f2{N). We consider two 
different paths pi and p 2 from P' DP”. They have split before or at level ^n. We 
assume w.l.o.g. that pi and p 2 correspond to and and split on 

the Xi-level, i € I, where i G (w.l.o.g. i € Ai^) and i ^ Now 

we assume that pi and P 2 share the node v on one of the levels between |n and 
|n. Then the path p* following p 2 from the source to v and pi from v also is a 
1-path corresponding to an input a' which is the characteristic vector of some 
set A' . Since Ai^ is poor, at least fc-l-1 variables Xr,r G Ai^, are tested positively 
on p*, namely on that part of p\ which starts at v. Hence, \A' fl Ai^ \ > fc -I- 1. 
Since Xi is tested negatively on p*, namely on that part of p 2 which stops at v, 
Ai^ 2 &nd fn,k{oI) = 0 (as shown above) in contradiction to the construction 
of p* as 1-path. This proves Theorem 5. □

Abstract. While deterministic finite automata seem to be well understood,
surprisingly many important problems concerning nondeterministic finite
automata (nfa’s) remain open. One such problem area is the study of different
measures of nondeterminism in finite automata. Our results are:

1. There is an exponential gap in the number of states between unambiguous
nfa’s and general nfa’s. Moreover, deterministic communication complexity
provides lower bounds on the size of unambiguous nfa’s.

2. For an nfa A we consider the complexity measures $advice_A(n)$ as the
number of advice bits, $ambig_A(n)$ as the number of accepting computations,
and $leaf_A(n)$ as the number of computations for worst case inputs of size n.
These measures are correlated as follows (assuming that the nfa A has at most
one “terminally rejecting” state):
$advice_A(n), ambig_A(n) \le leaf_A(n) \le O(advice_A(n) \cdot ambig_A(n))$.

3. $leaf_A(n)$ is always either a constant, between linear and polynomial
in n, or exponential in n.

4. There is a language for which there is an exponential size gap between
nfa’s with exponential leaf number/ambiguity and nfa’s with polynomial leaf
number/ambiguity. There also is a family of languages $KON_{m^2}$ such that
there is an exponential size gap between nfa’s with polynomial leaf
number/ambiguity and nfa’s with ambiguity m.

Keywords: finite automata, nondeterminism, limited ambiguity, descriptional
complexity, communication complexity.



1 Introduction 

In this paper the classical models of one-way finite automata (dfa’s) and their
nondeterministic counterparts (nfa’s) are investigated. While the structure and
fundamental properties of dfa’s are well understood, this is not the case for nfa’s.
For instance, we have efficient algorithms for constructing minimal dfa’s, but the
complexity of approximating the size of a minimal nfa is still unresolved (whereas
finding a minimal nfa solves a PSPACE-complete problem). Hromkovic, Seibert
and Wilke [HSW97] proved that the gap between the length of regular expressions
and the number of edges of corresponding nfa’s is between $n \log^2 n$ and
$n \log n$, but the exact gap is unknown. Another principal open question is to
determine whether there is an exponential gap between two-way deterministic
finite automata and two-way nondeterministic ones. The last partially successful
attack on this problem was made in the late seventies by Sipser, who
established an exponential gap between determinism and nondeterminism for
so-called sweeping automata (the property of sweeping is essential |MS()j ). 

Our main goal is to contribute to a better understanding of the power of non- 
determinism in finite automata PEZH. We focus on the following problems: 

1. The best known method for proving lower bounds on the size of minimal nfa’s
is based on nondeterministic communication complexity [H97]. All other
known methods are special cases of this method. Are there methods that
provide better lower bounds at least for some languages? How can one prove 
lower bounds on the size of unambiguous nfa’s (unfa’s), that is nfa’s which 
have at most one accepting computation for every word? 

2. It is a well known fact that there is an exponential gap between the sizes 
of minimal dfa’s and nfa’s for some regular languages. This is even known 
for dfa’s and unfa’s. But, it is open whether there exists an exponential gap 
between unfa’s and nfa’s, i.e., whether unambiguousness is a real restriction 
on nondeterminism. 

3. The degree of nondeterminism is measured in the literature in three different
ways. Let A be an nfa. The first measure $advice_A(n)$ equals the number of
advice bits for inputs of length n, i.e., the maximum number of nondeterministic
guesses in computations for inputs of length n. The second measure
$leaf_A(n)$ determines the maximum number of computations for inputs of
length n. The third measure $ambig_A(n)$ equals the maximum number of
accepting computations for inputs of length at most n. Obviously the second
and third measure may be exponential in the first. The question is whether
the measures are correlated.

To attack these problems we establish some new bridges between automata the- 
ory and communication complexity. This approach leads to contributions in the 
study of the tradeoff between the size and the degree of nondeterminism of nfa’s. 
Communication complexity theory contains deep results about the nature of
nondeterminism (see e.g., [KNSW94] and [HS96]) using the combinatorial structure
of the communication matrix. These results can be applied to finite automata.
Our main contributions are as follows: 

1. Let cc(L) resp. ncc(L) denote the deterministic resp. nondeterministic
communication complexity of L. First we show that there are regular languages L
for which there is an exponential gap between $2^{ncc(L)}$ and the minimal size of
nfa’s for L. This means that the lower bound method based on communication
complexity may be very weak. Then we show as a somewhat surprising
result that $2^{\sqrt{cc(L)}/k} - 2$ is a lower bound on the size of nfa’s with
ambiguity k for L. We furthermore show that $rank(M)^{1/k} - 1$ is a lower bound
for the number of states for nfa’s with ambiguity k, where M is a communication
matrix associated with L. It is possible that this lower bound is always better
than the first, see [Mt] for a discussion of the quality of the so-called
rank lower bound on communication complexity.

As a corollary we show that there is a sequence of regular languages $NID_m$
such that the size of a minimal nfa is linear in m, while the size of every
unfa for $NID_m$ is exponential in m.

2. We establish (assuming that the nfa A has at most one “terminally rejecting”
state) the relation

$advice_A(n), ambig_A(n) \le leaf_A(n) \le O(advice_A(n) \cdot ambig_A(n))$.

Furthermore we show that $leaf_A(n)$ is always either a constant, between
linear and polynomial, or at least exponential in the input length.

3. We show that there is a regular language such that there is an exponential
gap between the size of nfa’s with exponential ambiguity and nfa’s with
polynomial ambiguity. This result is obtained by showing that small nfa’s
with polynomial ambiguity for the Kleene closure $(K\#)^*$ imply small unfa’s
that work correctly on a polynomial fraction of inputs.

Furthermore we describe a sequence of languages $KON_{m^2}$ such that there is
an exponential size gap between nfa’s with polynomial ambiguity and nfa’s
with ambiguity m. $KON_m$ is a candidate for proving a size gap between nfa’s
with polynomial ambiguity and nfa’s with arbitrary constant ambiguity, even
when this ambiguity is larger than the optimal nfa size m.

This paper is organized as follows. In section 2 we give the basic definitions and 
fix the notation. Section 3 is devoted to the investigation of the relation between 
the size of nfa’s and communication complexity, and it includes the results of 1). 
Section 4 is devoted to the study of the relation between different measures of 
nondeterminism in finite automata, and presents the remaining results. 

2 Definitions and Preliminaries 

We consider the standard models of (one-way) finite automata (dfa’s) and
(one-way) nondeterministic finite automata (nfa’s). For every automaton A, L(A)
denotes the language accepted by A. The number of states of A is called the size
of A and denoted $size_A$. For every regular language L we denote the size of the
minimal dfa for L by s(L) and the size of minimal nfa’s accepting L by ns(L).
For any nfa A and any input x we use the computation tree $T_{A,x}$ to represent all
computations of A on x. Obviously the number of leaves of $T_{A,x}$ is the number
of different computations of A on x.

The ambiguity of an nfa A on input x is the number of accepting computations
of A on x, i.e., the number of accepting leaves of $T_{A,x}$. If the nfa A has ambiguity
at most one for all inputs, then A is called an unambiguous nfa (unfa) and uns(L)
denotes the size of a minimal unfa accepting L. More generally, if an nfa A has
ambiguity at most k for all inputs then A is called a k-ambiguous nfa and
$ns_k(L)$ denotes the size of a minimal k-ambiguous nfa accepting L.

For every nfa A we measure the degree of nondeterminism in the following ways.
Let $\Sigma$ denote the alphabet of A. For every input $x \in \Sigma^*$ and for every
computation C of A on x we define advice(C) as the number of nondeterministic
choices during the computation C, i.e., the number of nodes on the path of C in
$T_{A,x}$ which have more than one successor. Then

$advice_A(x) = \max\{advice(C) \mid C \text{ is a computation of } A \text{ on } x\}$

and $advice_A(n) = \max\{advice_A(x) \mid x \in \Sigma^n\}$.

For every $x \in \Sigma^*$ we define $leaf_A(x)$ as the number of leaves of $T_{A,x}$ and set

$leaf_A(n) = \max\{leaf_A(x) \mid x \in \Sigma^n\}$.

For every $x \in \Sigma^*$ we define $ambig_A(x)$ as the number of accepting leaves of
$T_{A,x}$ and set

$ambig_A(n) = \max\{ambig_A(x) \mid x \in \Sigma^{\le n}\}$.

Since a language need not contain words of all lengths we define ambiguity over
all words of length at most n which makes the measure monotone. Observe that
the leaf and advice measures are monotone as well.
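
The three definitions above can be illustrated on a single input by walking the
computation tree directly. The following is a minimal sketch and not part of
the original text; it assumes that the nfa is given as a dictionary `delta`
mapping (state, letter) to a tuple of successor states, and that a computation
which gets stuck before the end of the input counts as a rejecting leaf. The
names `measures` and `DELTA` are hypothetical.

```python
def measures(delta, start, accepting, x):
    """Return (advice, leaf, ambig) of the computation tree of the nfa on x."""
    def walk(q, i):
        # Successors of the node labeled q on level i (none once x is consumed).
        succ = delta.get((q, x[i]), ()) if i < len(x) else ()
        if not succ:
            # A leaf; it is accepting only if the whole input has been read.
            return 0, 1, int(i == len(x) and q in accepting)
        below = [walk(p, i + 1) for p in succ]
        branching = 1 if len(succ) > 1 else 0   # node with more than one successor
        return (branching + max(b[0] for b in below),   # advice along the best path
                sum(b[1] for b in below),               # number of leaves
                sum(b[2] for b in below))               # number of accepting leaves
    return walk(start, 0)

# Assumed example nfa: words over {0,1} that contain the factor "11".
DELTA = {
    (0, "0"): (0,), (0, "1"): (0, 1),
    (1, "1"): (2,),
    (2, "0"): (2,), (2, "1"): (2,),
}

print(measures(DELTA, 0, {2}, "111"))
```

Taking the maximum of each component over all inputs of length n (or of length
at most n for the ambiguity) gives the functions defined above for this example
automaton.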

Note that different definitions have been used by other authors, see 
e.g. EEM, mm, where the number of advice bits is maximized over 
all inputs and minimized over all accepting computations on those inputs. In 
this case there are nfa’s which use more than constant but less than linear (in 
the input length) advice bits, but this behavior is not known to be possible for 
minimal nfa’s. 

To prove lower bounds on the size of finite automata we shall use two-party
communication complexity. This widely studied measure was introduced by Yao
and is the subject of two monographs [H97], [KN97]. Let $L \subseteq X \times Y$ be a
(possibly infinite) language. A two-party communication protocol works on two
computers $C_I$ and $C_{II}$ with unbounded computational power and a
communication link between them. At the beginning of the computation $C_I$ receives an
input $x \in X$, and $C_{II}$ an input $y \in Y$. Then the computers exchange binary
messages according to a fixed protocol until one of them knows whether $xy \in L$.
cc(L) resp. ncc(L) is the minimum over all deterministic resp. nondeterministic
protocols of the worst case number of bits exchanged for any input. Note
that for $X, Y = \Sigma^*$ this definition yields the uniform version of communication
complexity [DHRS97], for $X, Y = \{0,1\}^n$ the standard version. Unless stated
otherwise we use $X, Y = \Sigma^*$.



Measures of Nondeterminism in Finite Automata 






A one-way protocol P is a protocol in which only one message is sent from $C_I$
to $C_{II}$, who is then able to compute the answer. The one-way communication
complexity of L is the communication complexity of the best one-way protocol for
L. The message complexity mc(P) for a deterministic one-way protocol P counts
the number of different messages sent by P. The message complexity of L is the
message complexity of the best one-way protocol for L. The nondeterministic
message complexity of L, nmc(L), is the minimum over all nondeterministic
protocols recognizing L of the number of messages exchanged over all inputs. It
is well known [PS84] that for nondeterministic communication one-way protocols
are optimal.

Bounded ambiguity protocols have been investigated in \vm\. pn|. puqTTwn^ . 
protocols with bounded advice in 



3 Communication Complexity and Finite Automata 



Duris, Hromkovic, Rolim, and Schnitger [DHRS97] (see also [H86]) observed that
the minimal number of states of a dfa recognizing L can be characterized by the
communication complexity of L.

Fact 1 For every regular language L: mc(L) = s(L).

The lower bound $s(L) \ge mc(L)$ can easily be generalized to nondeterministic
and probabilistic automata.

Nondeterministic communication complexity seems to provide the best lower
bounds on ns(L). All previously known techniques like the fooling set approach
are special cases of this approach. Moreover the fooling set method, which covers
all previous efforts in proving lower bounds on ns(L), can (for some languages)
provide exponentially smaller lower bounds than the method based on
nondeterministic communication complexity [DHS96].

The first question is therefore whether nmc(L) can be used to approximate
ns(L). Unfortunately this is not possible.



Lemma 1 There exists a regular language PART such that $nmc(PART) = O(n^2)$
and $ns(PART) \ge 2^{\Omega(n)}$.

Proof Sketch: Let $PART = \{xyz : |x| = |y| = |z| = n, \text{ and } x \ne z \lor x = y\}$.
First we describe a nondeterministic protocol for PART using $O(n^2)$ messages.
Players $C_I$ and $C_{II}$ compute the lengths $l_I$, $l_{II}$ of their inputs. $C_I$ communicates
$l_I$ and $C_{II}$ rejects when $l_I + l_{II} \ne 3n$. So we assume that $l_I + l_{II} = 3n$ in the
following.

Case 1: $l_I \le n$.

$C_I$ chooses a position $1 \le i \le l_I$ and communicates $i, x_i, l_I$. $C_{II}$ accepts if
$x_i \ne z_i$. Otherwise $C_{II}$ accepts if and only if y = z.

Note that if $x \ne z$, then there is an accepting computation. If however x = z,
then $C_{II}$ accepts iff y = z, that is iff x = y.

The other cases $n < l_I \le 2n$ and $2n < l_I \le 3n$ are handled similarly using $O(n^2)$
messages.

We give the following lower bound on ns(PART). Assume that the input is
xyx. Then the nfa A has to test whether x = y. A can be simulated by a
2-round nondeterministic protocol, where $C_I$ holds x and $C_{II}$ holds y, using
communication $2 \log size_A$. Testing the equality of two words of length n
nondeterministically requires at least n bits, so $size_A$ must be at least $2^{n/2}$. □

To find lower bound methods for ns(L) that provide results at most polynomially
smaller than ns(L) is one of the central open problems on finite automata. In
the following we concentrate on lower bounds for nfa’s with constant ambiguity.
Even for unambiguous automata no nontrivial method for proving lower bounds
has been known up to now.

To present our method we represent regular languages as communication matrices
of infinite size. For an alphabet $\Sigma$ and sets $X, Y \subseteq \Sigma^*$ the communication
matrix of a language L is the infinite Boolean matrix $M(L, \Sigma, X, Y) = (a_{x,y})$
for $x \in X$, $y \in Y$, where $a_{x,y} = 1$ iff $xy \in L$. We will use the shorter notation
M(L), where $\Sigma$, X, Y are clear from the context.

In [DHRS97] it was proved that the number of different rows of M(L) equals
mc(L) which in turn equals s(L). Since the number of different rows is finite for
all regular languages we can define $rank_Q(M(L))$ as the rank of M(L) over the
rational numbers Q.
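
The statement that the number of different rows of M(L) equals s(L) can be
checked experimentally on a finite part of the matrix. The sketch below is an
illustration added here, not taken from the paper: it truncates X and Y to all
words of length at most `max_len`, so the count it returns is in general only a
lower bound on s(L), which becomes exact once the truncation is large enough.
The function names and the two example languages are assumptions.

```python
from itertools import product

def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length 0..max_len."""
    return ["".join(t) for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)]

def distinct_rows(member, alphabet, max_len):
    """Count distinct rows of the truncated matrix with a_{x,y} = [xy in L]."""
    ws = words_up_to(alphabet, max_len)
    rows = {tuple(member(x + y) for y in ws) for x in ws}
    return len(rows)

def ones_divisible_by_3(w):
    # Example language whose minimal dfa has three states (a counter mod 3).
    return w.count("1") % 3 == 0

def ends_with_one(w):
    # Example language whose minimal dfa has two states.
    return w.endswith("1")

print(distinct_rows(ones_divisible_by_3, "01", 3))
print(distinct_rows(ends_with_one, "01", 2))
```

Each distinct row corresponds to one class of prefixes that no suffix can tell
apart, which is why the count matches the number of states of the minimal dfa
in these two examples.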

Theorem 1 For every regular $L \subseteq \Sigma^*$

a) $uns(L) \ge rank_Q(M(L))$.

b) $ns_k(L) \ge rank_Q(M(L))^{1/k} - 1$.

c) $ns_k(L) \ge 2^{\sqrt{cc(L)}/k} - 2$.

Proof: Let A be an optimal unfa for L. A can be simulated by a one-way
nondeterministic protocol as follows: $C_I$ simulates A on its input and communicates
the obtained state. $C_{II}$ continues the simulation and accepts/rejects accordingly.
Obviously the number of messages is equal to $size_A$ and the protocol works with
unambiguous nondeterminism.

It is easy to see that the messages of the protocol correspond to $size_A$
submatrices of the matrix M(L) covering all ones exactly once. Hence the rank is at
most $size_A$ and we have shown a), which is related to the rank lower bound on
communication complexity [MS82].

For b) observe that the above simulation induces a cover of the ones in M(L) so
that each one is covered at most k times. By the following fact from [KNSW94]
we are done:

Fact 2 Let $\kappa_r(M)$ denote the minimal size of a set of submatrices covering the
ones of a Boolean matrix M so that each is covered at most r times. Then
$(1 + \kappa_r(M))^r \ge rank(M)$.

For the other claim again simulate A by a one-way k-ambiguous nondeterministic
protocol with $size_A$ messages.

Results of [KNSW94] (see also [L90], [Y91]) imply that a k-ambiguous
nondeterministic one-way protocol with m messages can be simulated by a
deterministic two-way protocol with communication $\log(m^k + 1) \cdot k \cdot \log(m + 2)$. Thus
$cc(L) \le \log(size_A^k + 1) \cdot k \cdot \log(size_A + 2) \le \log^2((size_A + 2)^k)$ and c) follows. □
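
For readers who want the last step of part c) spelled out, the following worked
derivation is added as an illustration. It assumes logarithms to base 2 and
$k \ge 1$, and uses only the simulation bound stated in the proof together with
the estimate $size_A^k + 1 \le (size_A + 2)^k$.

```latex
\begin{align*}
cc(L) &\le \log(size_A^k + 1) \cdot k \cdot \log(size_A + 2) \\
      &\le k \log(size_A + 2) \cdot k \log(size_A + 2)
       = k^2 \log^2(size_A + 2), \\
\sqrt{cc(L)} &\le k \log(size_A + 2), \\
size_A &\ge 2^{\sqrt{cc(L)}/k} - 2 .
\end{align*}
```

Applying the last line to an optimal k-ambiguous nfa for L gives the bound on
$ns_k(L)$.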

Before giving an application of the lower bound method we point out that neither
$2^{\sqrt{cc(L)}}$ nor $rank_Q(M(L))$ is a lower bound method capable of proving
polynomially tight lower bounds on the minimal size of unfa’s for all languages.
In the first case this is trivial, in the second case it follows from a modification
of a result separating rank from communication complexity (see [KN97]). But
the lower bounds may still be pseudopolynomial.

Now we apply Theorem 1 in order to prove an exponential gap between ns(L)
and uns(L) for a regular language.

Corollary 1 There is a language $NID_m$ with the following properties:

a) $NID_m$ can be recognized by an nfa A with ambiguity O(m) and size O(m).

b) Any nfa with ambiguity k for $NID_m$ has size at least $2^{m/k} - 1$, and in
particular any unfa for $NID_m$ must have $2^m$ states.

c) No nfa with ambiguity $o(m / \log m)$ for $NID_m$ has polynomial size in m.

Proof: Let $NID_m = \{u \in \{0,1\}^* \mid \exists i : u_i \ne u_{i+m}\}$.

a) First the nfa guesses a residue i modulo m, and then checks whether there is
a position $p \equiv i \bmod m$ with $u_p \ne u_{p+m}$.

b) Observe that the submatrix spanned by all words u and v with $u, v \in \{0,1\}^m$
is the “complement” of the $2^m \times 2^m$ identity matrix. The result now follows from
parts a) and b) of Theorem 1.

c) is an immediate consequence of b). □
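
The observation used in part b) of the proof can be checked for small m. The
sketch below is an added illustration under stated assumptions: it builds the
submatrix of $M(NID_m)$ whose rows and columns are the words of length m, and
computes its rank over the rationals by Gaussian elimination with exact
fractions. The helper names `in_nid`, `nid_submatrix` and `rank_over_q` are
hypothetical, and positions are counted from 0.

```python
from fractions import Fraction
from itertools import product

def in_nid(w, m):
    """Membership in NID_m: some position i with w[i] != w[i + m]."""
    return any(w[i] != w[i + m] for i in range(len(w) - m))

def nid_submatrix(m):
    """Rows u and columns v range over {0,1}^m; the entry is 1 iff uv is in NID_m."""
    words = ["".join(bits) for bits in product("01", repeat=m)]
    return [[int(in_nid(u + v, m)) for v in words] for u in words]

def rank_over_q(rows):
    """Rank over the rationals via Gauss-Jordan elimination on exact fractions."""
    mat = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(mat[0])):
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        for r in range(len(mat)):
            if r != rank and mat[r][col] != 0:
                factor = mat[r][col] / mat[rank][col]
                mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rank])]
        rank += 1
    return rank

for m in (1, 2, 3):
    print(m, rank_over_q(nid_submatrix(m)))
```

For these values of m the submatrix has zeros exactly on the diagonal and full
rank $2^m$, which is the quantity that Theorem 1 turns into a size bound.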

4 Degrees of Nondeterminism in Finite Automata 

It is easy to see that $advice_A(n) \le leaf_A(n) \le 2^{O(advice_A(n))}$ and
$ambig_A(n) \le leaf_A(n)$ for every nfa A. The aim of this section is to investigate
whether stronger relations between these measures hold.

Lemma 2 For all nfa A either

a) $advice_A(n) \le size_A$ and $leaf_A(n) \le size_A^{size_A}$, or

b) $advice_A(n) \ge n/size_A - 1$ and $leaf_A(n) \ge n/size_A - 1$.

Proof: If some reachable state q of A belongs to a cycle in A and if q has two
edges with the same label originating from it such that one of these edges belongs
to the cycle, then $advice_A(n) \ge (n - size_A)/size_A \ge n/size_A - 1$. Otherwise
for all words all states with a nondeterministic decision are traversed at most
once. □

Our next lemma relates the leaf function to ambiguity. A state q of an nfa A
is called terminally rejecting if there is no word and no computation of A such
that A accepts when starting in q, i.e., $\delta^*(q, v)$ contains no accepting state for
any word v. Clearly there is at most one terminally rejecting state in a minimal
automaton, because otherwise these states can be joined reducing the size. Call
all other states of A undecided.

Lemma 3 Every nfa A with at most one terminally rejecting state satisfies

$leaf_A(x) \le ambig_A(|x| + size_A) \cdot |x| \cdot size_A + 1$

for all x.

Proof: Let $k = ambig_A(|x| + size_A)$. If the computation tree consists only
of nodes marked with the terminally rejecting state, then the tree consists of
just one leaf and the claim is trivial. For the general case, consider a level of
the computation tree of A on x that is not the root level. Assume that the
level contains more than $k \cdot size_A$ nodes labeled with undecided states (called
undecided nodes). Then one undecided state q must appear at least k + 1 times
on this level. There are k + 1 computations of A on a prefix of x such that q is
reached. If q is accepting, then the prefix of x is accepted with k + 1 computations,
a contradiction, since $ambig_A$ is monotone. If q is rejecting, but undecided, then
there is a word v of length at most $size_A$ such that v is accepted by some
computation of A starting in q. But then the prefix of x concatenated with v is
accepted by at least k + 1 computations, a contradiction.

Thus each level of the tree that is not the root level contains at most $k \cdot size_A$
undecided nodes. Overall there are at most $|x| \cdot k \cdot size_A + 1$ undecided nodes.
Observe that each node has at most one terminally rejecting child. Thus the
number of terminally rejecting leaves is equal to the number of undecided nodes
that have a terminally rejecting child. Hence the number of terminally rejecting
leaves is at most the number of undecided nodes minus the number of undecided
leaves. Thus the overall number of leaves is at most the number of terminally
rejecting leaves plus the number of undecided leaves which is at most the number
of undecided nodes. So overall there are at most $k \cdot |x| \cdot size_A + 1$ leaves. □

Theorem 2 Every nfa A with at most one terminally rejecting state satisfies

$advice_A(n), ambig_A(n) \le leaf_A(n) \le O(ambig_A(n) \cdot advice_A(n))$.

Especially for any such unfa: $advice_A(n) = \Theta(leaf_A(n))$.

Proof: Observe that for all n: $ambig_A(n) = \Theta(ambig_A(n + O(1)))$, since $ambig_A$
is monotone and at most exponential. □

Next we further investigate the growth of the leaf function. 

Lemma 4 For every nfa A either $leaf_A(n) \le (n \cdot size_A)^{size_A}$ or
$leaf_A(n) \ge 2^{\Omega(n)}$.



Proof; Assume that an nfa A contains some state q, such that q can be entered 
on two different paths starting in q, where each path is labeled with the same 
word w. It is not hard to show that in this case there are two different paths 
from q to q labeled with a word w of length size^ — 1- Then the computation 
tree of $uw^m$ (where u leads from the starting state to q) has at least $2^m$
leaves, which is $2^{\Omega(n)}$ for inputs of length $n = |uw^m|$.

Now assume that A does not contain such a state. Then for each nondeterministic
state q (i.e., a state with more than one successor for the same letter) and any
computation tree the following holds: If q is the label of a vertex v, then q appears
in each level of the subtree of v at most once.

We prove by induction over the number k ($k \le size_A$) of different
nondeterministic states in a computation tree that the number of leaves is at most
$(n \cdot size_A)^k$. The claim is certainly true if there are no nondeterministic states.

Assume that there are k nondeterministic states, with some state $q_1$ appearing
first in the tree. Observe that no level in the entire computation tree contains
$q_1$ more than once.

For each occurrence of $q_1$ in the computation tree fix some child, so that the
overall number of leaves is maximized. We get a tree with one nondeterministic
state less, and by the induction hypothesis this tree has at most $(n \cdot size_A)^{k-1}$
leaves.

Since $q_1$ appears at most once on each level and since there are at most $size_A$
children of $q_1$ on each level, there are at most $(n \cdot size_A)^k$ leaves. □

Lemma 2 and 4 give us

Theorem 3 For all A: $leaf_A(n)$ is bounded by a constant, or is between linear
and polynomial in n, or is $2^{\Theta(n)}$.
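
The case distinction in the proof of Lemma 4 asks whether some reachable state
q has two different paths from q back to q that carry the same label. One way
to test this, sketched here as an illustration and not taken from the paper, is
to search the product of the automaton with itself: two such paths exist iff
the pair (q, q) lies on a cycle of the product graph that passes through a pair
of two different states. The representation of `delta` and all function names
are assumptions.

```python
def reachable_from(start_nodes, successors):
    """Nodes reachable from `start_nodes` in one or more steps."""
    seen, stack = set(), list(start_nodes)
    while stack:
        node = stack.pop()
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def has_exponential_leaf(delta, alphabet, start):
    """True iff some reachable q has two different q-to-q paths with equal labels."""
    def step(q):
        return {p for a in alphabet for p in delta.get((q, a), ())}

    def pair_step(pair):
        # Move both components on the same letter.
        p, q = pair
        return {(p2, q2) for a in alphabet
                for p2 in delta.get((p, a), ())
                for q2 in delta.get((q, a), ())}

    states = reachable_from([start], step) | {start}
    for q in states:
        for pair in reachable_from([(q, q)], pair_step):
            # An off-diagonal pair on a cycle through (q, q) witnesses two paths.
            if pair[0] != pair[1] and (q, q) in reachable_from([pair], pair_step):
                return True
    return False

# Assumed examples: FIB branches inside a cycle, CONTAINS_11 does not.
FIB = {(0, "a"): (0, 1), (1, "a"): (0,)}
CONTAINS_11 = {(0, "0"): (0,), (0, "1"): (0, 1), (1, "1"): (2,),
               (2, "0"): (2,), (2, "1"): (2,)}

print(has_exponential_leaf(FIB, "a", 0), has_exponential_leaf(CONTAINS_11, "01", 0))
```

In the first example the number of leaves on the input $a^n$ grows like the
Fibonacci numbers, while in the second it grows linearly, matching the two
cases of the lemma.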



Now we consider the difference between polynomial and exponential ambiguity 
resp. polynomial and exponential leaf number. We show that languages which 
have small automata of polynomial ambiguity are related to the concatenation of 
languages having small unfa’s. If the language is a Kleene closure, then one unfa 
accepts a large subset. Compare [(IK W91)j . where Kleene closures are shown to 
be recognizable as efficient by nfa’s with constant advice as by dfa’s. 



Theorem 4 a) Let L be an infinite regular language and A some nfa for L
with polynomial ambiguity. Then there are $k \le size_A$ languages $L_i$ such that
$L_1 \cdots L_k \subseteq L$, $L_i$ is recognizable by an unfa with $O(size_A)$ states, and

$|L_1 \cdots L_k \cap \Sigma^n| \, / \, |L \cap \Sigma^n| = \Omega(1)$ for infinitely many n.

b) Let $L = (K\#)^*$ for a regular language K not using the letter # and let A be
some nfa for L with polynomial ambiguity. Then for all m there is an unfa A'
with $O(size_A)$ states that decides $L' \subseteq L$ such that for infinitely many n

$|L' \cap ((\Sigma^m \cap K)\#)^n| \, / \, |((\Sigma^m \cap K)\#)^n| = \Omega(1/poly(n))$.



Proof Sketch: a) Define the ambiguity graph of A in the following way: the
nodes are the (reachable) states of A and there is an edge from $q_i$ to $q_j$ if there
are two paths from $q_i$ to $q_j$ in A with the same label sequence. Note that the
ambiguity graph is acyclic iff the ambiguity of A is polynomially bounded as we
have seen in the proof of Lemma 4.

We now construct an unfa $A_{i,j,k}$ which accepts those words that lead in A from $q_i$
to $q_j$ and then via one edge to $q_k$. Here we assume that the longest path from $q_i$
to $q_k$ in the ambiguity graph consists of one edge and $q_j$ is reachable from $q_i$ in
A, but not in the ambiguity graph. Moreover, we demand that there is an edge
in A from $q_j$ to $q_k$.

The states of $A_{i,j,k}$ are the states reachable in A from $q_i$, but not reachable in
the ambiguity graph from $q_i$, plus the state $q_k$. The edges are as in A except
that the only edges to $q_k$ come from $q_j$. The start state is $q_i$ and the accepting
state is $q_k$. $L_{i,j,k}$ is the language accepted by $A_{i,j,k}$.

Now consider the words $w \in L \cap \Sigma^n$. Each such word is accepted on some path
in A leading from $q_0$ to some accepting state $q_a$. Fix one such accepting state so
that a constant fraction of all words w is accepted and make the other accepting
states rejecting. On an accepting path for w the states appear without violating
the topological ordering of the ambiguity graph. So we may fix a sequence of
states $q_0, q_{i_1}, \ldots, q_a$ so that
$w \in L_{0,i_1,i_2} L_{i_2,i_3,i_4} \cdots L_{i_{2k-2},i_{2k-1},a}$. Since there are
only finitely many such sequences we are done.

b) Similar to a) we get k languages Li,...,Lk decidable by small unfa’s Ai, 
such that ^ = •^(1) for infinitely many n. 

A partition of the letters of words in is given by mapping the nm letters 

to the k unfa’s. There are at most (^.”^) • (m+ 1)^“^ possible partitions. So some 
partition must be consistent with accepting paths for a fraction of \/poly{n) of 
((A™ n Fix one such partition. Then for each words w G an 

unfa is responsible for some prefix u, followed by a concatenation of words of 
the form and finally a word of the form ^v. For all f we fix a prefix Ui, a 

suffix Vi, and states qi,q[ entered when reading the first and final occurence of 
#, such that as many words from ((A’” n AT)#)" as possible are accepted under 
this fixing. At least a fraction of size~^ = l/poly{n) of ((A™ fl AT)#)” 
has accepting paths consistent with this fixing. 

If any Ai accepts less than a polynomial fraction (compared to the projection 
of (A™ n K)^)'^ to the responsibility region of Ai) then overall less than a 
polynomial fraction is accepted. Hence one Ai can be found, where from qi a 
polynomial fraction of words in leads to non-terminally rejecting 

states in Ai. Making one non-terminally rejecting state reached by a # edge 
accepting and removing the original accepting states yields an unfa that accepts 
the desired subset for infinitely many n. □ 

Corollary 2 There is a family of languages $KL_m$ such that $KL_m$ can be
recognized by an nfa with advice O(n), leaf $2^{O(n)}$ and size poly(m), while every nfa
with polynomial leaf number/ambiguity needs size $2^{\Omega(m)}$ to recognize $KL_m$.

Proof; Let LNDISJm = • yi ■ ■ ■ ym\xi,yi encode elements from a 

size universe and the sets UiXi and U,v,- intersect nontriviallyj. Then let 
KL^ = {LNDISJ^ff)*. 

Given a polynomial ambiguity nfa for $KL_m$ we get an unfa accepting a fraction
of 1/poly(n) of $(LNDISJ_m\#)^n$ for infinitely many n by Theorem 4b). Then we
simulate the unfa by a nondeterministic communication protocol, where player
$C_I$ receives all x and player $C_{II}$ all y inputs. The protocol needs $O(n \cdot \log size_A)$
bits to decide correctly on a 1/poly(n) fraction of $(LNDISJ_m\#)^n$ and has
unambiguous nondeterminism. A known result implies that this task needs
communication $\Omega(nm)$ and thus $size_A \ge 2^{\Omega(m)}$. □

So we have a strong separation between the size of automata with polynomial
ambiguity and the size of automata with exponential ambiguity. The situation
seems to be more complicated if one compares constant and polynomial
ambiguity. Here we can only show that there is a family $KON_{m^2}$ of languages with
small size nfa’s of polynomial ambiguity, while nfa’s of ambiguity m are
exponentially larger. In the following theorem we describe a candidate for a language
that has efficient nfa’s only when ambiguity is polynomial. Furthermore the
language exhibits an almost optimal gap between the size of unfa’s and polynomial
ambiguity nfa’s. In the (omitted) proof the rank of the communication matrix
of $KON_m$ is shown to be large by a reduction from the disjointness problem.

Theorem 5 Let $KON_m = \{0,1\}^* 0 M_m 0 \{0,1\}^*$, where $M_m$ contains all words
in $\{0,1\}^*$ with a number of ones that is divisible by m. $KON_m$ can be recognized
by an nfa A with $ambig_A(n), leaf_A(n) = O(n)$ and size m + 2, while any nfa
with ambiguity k for $KON_m$ needs at least $2^{(m-1)/k} - 2$ states.
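
The proof of Theorem 5 is omitted in the text, so the automaton below is only
one possible construction with m + 2 states and is an assumption of this
sketch: a start state `s` that guesses the opening 0, a counter modulo m for
the ones, and an accepting sink `f` that is entered on the first 0 read while
the counter is 0. Leaving the counter on that first opportunity loses no words,
because any later closing 0 would also be a valid witness. The code compares
the automaton with a brute-force membership test and counts its computations.

```python
from itertools import product

def kon_nfa(m):
    """Transition table of an (m + 2)-state nfa; 's' is the start, 'f' accepts."""
    delta = {("s", "0"): ["s", 0], ("s", "1"): ["s"],
             ("f", "0"): ["f"], ("f", "1"): ["f"]}
    for j in range(m):
        delta[(j, "1")] = [(j + 1) % m]            # count ones modulo m
        delta[(j, "0")] = ["f"] if j == 0 else [j]  # closing 0 only at count 0
    return delta

def count_paths(delta, word):
    """Return (number of computations, number of accepting computations)."""
    counts = {"s": 1}
    for letter in word:
        nxt = {}
        for state, c in counts.items():
            for succ in delta[(state, letter)]:
                nxt[succ] = nxt.get(succ, 0) + c
        counts = nxt
    return sum(counts.values()), counts.get("f", 0)

def in_kon(word, m):
    """Brute force: two zeros enclosing a factor whose number of ones is 0 mod m."""
    zeros = [i for i, ch in enumerate(word) if ch == "0"]
    return any(word[i + 1:j].count("1") % m == 0
               for a, i in enumerate(zeros) for j in zeros[a + 1:])

print(count_paths(kon_nfa(2), "0110"), in_kon("0110", 2))
```

Since the only nondeterministic choice is made in `s` when reading a 0, every
computation is determined by the position of that choice, so both the leaf
number and the ambiguity of this automaton are at most n + 1.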

For every m the language $KON_{m^2}$ of Theorem 5 can be recognized with size
$O(m^2)$, leaf number and ambiguity O(n), and advice O(n), while every
m-ambiguous nfa has size $2^{\Omega(m)}$. We conjecture that the language $KON_m$ cannot be
decided by nfa’s with constant ambiguity (even larger than m) and size poly(m).

5 Conclusions and Open Problems 

We have shown that communication complexity can be used to prove size lower 
bounds for nfa’s with small ambiguity. This approach is limited, because for 
nontrivial bounds ambiguity has to be smaller than the size of a minimal nfa. 
Is it possible to prove lower bounds for automata with arbitrarily large constant 
ambiguity, when small equivalent polynomial ambiguity automata exist? 

In this context, it would be also of interest to investigate the fine structure of 
languages with regard to constant ambiguity. At best one could show exponential 
differences between the number of states for ambiguity k and the number of
states for ambiguity k + 1. Observe however, that such an increase in power
is impossible provided that the size of unfa’s does not increase substantially 
under complementation iKnni. Analogous questions apply to polynomial and 
exponential ambiguity. 

Are there automata with nonconstant but sublinear ambiguity? A negative an- 
swer establishes theorem 3 also for ambiguity as complexity measure. 

Other questions concern the quality of communication as a lower bound method. 
How far can rank resp. be from the minimal unfa size? Note that the 

bounds are not polynomially tight. Are there alternative lower bound methods? 
Finally, what is the complexity of approximating the minimal number of states 
of an nfa?

Abstract. A long standing open problem in the theory of (Mazurkiewicz)
traces has been the question whether LTL (Linear Time Logic)
is expressively complete with respect to the first order theory. We solve
this problem positively for finite and infinite traces and for the simplest
temporal logic, which is based only on next and until modalities. Similar
results were established previously, but they were all weaker, since they
used additional past or future modalities. Another feature of our work is
that our proof is direct and does not use any reduction to the word case.

Keywords: Temporal logics, Mazurkiewicz traces, concurrency



1 Introduction 



Nowadays, it is widely accepted that we need to develop methods to verify
critical systems. For this, we need formal specifications for the expected behaviors
of systems. Most often, these formal specifications are given by temporal logic
formulae. The task is even harder when considering concurrent systems. Indeed,
the usual approach is to reduce concurrent systems to sequential ones by
considering linearizations of concurrent executions. This allows one to use techniques and
tools developed for sequential systems but introduces a combinatorial explosion
which then has to be fought by all means. Instead, one could try to work directly
on concurrent systems, and this explains why a lot of research has been devoted
recently to the study of temporal logics for concurrency. A major aim is to find a
temporal logic which is expressive enough to ensure that all desired specifications
can be formalized.

For sequential systems, Kamp’s Theorem |H| states that the linear time logic 
has the same expressive power as the first order theory of words. Originally 
Kamp used future and past modalities, but it was established later by Gabbay’s 
separation theorem 0 that past modalities can be avoided. 

Trace theory, initiated by Mazurkiewicz is one of the most popular set- 
tings to study concurrency. See j2j for the general background of trace the- 
ory, and in particular m for traces and logic and [Zj for infinite traces. It 
is no surprise that various temporal logics for traces have been extensively 
studied pilllil IH2ll4llbllbj . A long standing open problem [41 1 7j was to know
whether LTL (the basic linear temporal logic) is expressively complete with
respect to the first order theory of finite and infinite traces.

The first completeness result for traces is due to Ebinger. He proved that a
linear temporal logic with both past and future modalities is expressively
complete for finite traces, but his approach did not cope with infinite traces.
Then, Thiagarajan and Walukiewicz proved the completeness, both for finite and
infinite traces, of LTrL, a linear temporal logic with future modalities and
past modalities in the restricted form of past constants. These two results
were obtained using a reduction to the word case. In |0j it was claimed that
LTL is expressively complete for finite traces, but the proof contained a
serious mistake. The expressive completeness for a pure future temporal logic
(LTL$_f$) was first established in ^j. The result holds for finite and infinite
traces, but LTL$_f$ contains new filter modalities in addition to the usual
next and until modalities. Thus, the problem of the expressive completeness of
the basic linear temporal logic was still open.

In this paper, we solve this problem positively. Previous expressive com- 
pleteness results for words and for traces are now formal corollaries of our main 
theorem. It should be noted that, contrary to most previous works, our proof 
does not use any reduction to the word case. Instead, we extend the new proof 
introduced by Wilke for finite words HHI which is based on the well-known fact 
that first order languages are aperiodic. We follow here the same approach as 
in j2]. In the former paper filter modalities were used to express some special 
products of trace languages. Our new proof is a substantial revision of the pre- 
vious one. We are now able to express the products mentioned above directly 
with basic formulae without using the filter modalities. 

2 Preliminaries 

By $(\Sigma, I)$ we mean a finite independence alphabet where $\Sigma$ denotes a
finite alphabet and $I \subseteq \Sigma \times \Sigma$ is an irreflexive and
symmetric relation called the independence relation. The complementary relation
$D = (\Sigma \times \Sigma) \setminus I$ is called the dependence relation. The
monoid of finite traces $\mathbb{M}(\Sigma, I)$ is defined as a quotient monoid
with respect to the congruence relation induced by $I$, i.e.,
$\mathbb{M}(\Sigma, I) = \Sigma^* / \{ab = ba \mid (a, b) \in I\}$. We also write
$\mathbb{M}$ instead of $\mathbb{M}(\Sigma, I)$. For $A \subseteq \Sigma$ we
denote by $\mathbb{M}_A$ the submonoid of $\mathbb{M}(\Sigma, I)$ generated by $A$:

$$\mathbb{M}_A = \mathbb{M}(A, I \cap A \times A) = \{x \in \mathbb{M}(\Sigma, I) \mid \mathrm{alph}(x) \subseteq A\}.$$

A trace $x \in \mathbb{M}$ is given by a congruence class of a word
$a_1 \cdots a_n \in \Sigma^*$ where $a_i \in \Sigma$, $1 \le i \le n$. By abuse
of language, but for simplicity, we denote a trace $x$ by one of its
representing words $a_1 \cdots a_n$. The number $n$ is called the length of $x$;
it is denoted by $|x|$. For $n = 0$ we obtain the empty trace; it is denoted by
1. The alphabet $\mathrm{alph}(x)$ of a trace $x$ is the set of letters
occurring in $x$. A trace language is a subset $L \subseteq \mathbb{M}$. The
concatenation is defined as usual:

$$KL = \{xy \in \mathbb{M} \mid x \in K,\ y \in L\}.$$

Every trace $a_1 \cdots a_n \in \mathbb{M}$ can be identified with its
dependence graph. This is (an isomorphism class of) a node-labeled, acyclic,
directed graph $[V, E, \lambda]$, where $V = \{1, \ldots, n\}$ is a set of
vertices, each $i \in V$ is labeled by $\lambda(i) = a_i$, and there is an edge
$(i, j) \in E$ if and only if both $i < j$ and $(\lambda(i), \lambda(j)) \in D$.
In pictures it is common to draw the Hasse diagram only. Thus, all redundant
edges are omitted. For instance, let $(\Sigma, D) = a - b - c$, i.e.,
$I = \{(a, c), (c, a)\}$. Then the trace $x = abcabca$ is given by

[Figure: Hasse diagram of the dependence graph of $x = abcabca$]



An infinite trace is an infinite dependence graph $[V, E, \lambda]$ such that
for all $j \in V$ the set $\downarrow j = \{i \in V \mid i \le j\}$ is finite. A
real trace is a finite or infinite trace. The set of real traces is denoted by
$\mathbb{R}(\Sigma, I)$ or simply by $\mathbb{R}$. For a real trace
$x = [V, E, \lambda]$ the alphabet is $\mathrm{alph}(x) = \lambda(V)$ and the
alphabet at infinity is defined by the set of letters occurring infinitely many
times in $x$, i.e.,
$\mathrm{alphinf}(x) = \{a \in \Sigma \mid |\lambda^{-1}(a)| = \infty\}$. See 0
for details about infinite traces. Usually we consider real trace languages. If
we speak about finitary languages, then we refer to subsets of $\mathbb{M}$. As
for finite traces, if $A \subseteq \Sigma$ is a subalphabet then we let
$\mathbb{R}_A = \{x \in \mathbb{R} \mid \mathrm{alph}(x) \subseteq A\}$.

By $\min(x)$ and $\max(x)$ we refer to the minimal and maximal letters in the
dependence graph. (We shall use the notation $\max(x)$ only for finite traces
$x \in \mathbb{M}$.) In the example above $\min(x) = \{a\}$ and
$\max(x) = \{a, c\}$. Formally:

$$\min(x) = \{a \in \Sigma \mid x \in a\mathbb{R}\},$$
$$\max(x) = \{a \in \Sigma \mid x \in \mathbb{M}a\}.$$
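The definitions above can be checked on the running example. The following is a minimal sketch, not part of the original development: the function names (`dependence`, `hasse_edges`, `min_letters`, `max_letters`) and the encoding of a finite trace by one of its representing words are our own assumptions.

```python
def dependence(edges, alphabet):
    """Build the reflexive, symmetric dependence relation D from undirected edges."""
    dep = {(a, a) for a in alphabet}
    for a, b in edges:
        dep |= {(a, b), (b, a)}
    return dep


def hasse_edges(word, dep):
    """Return the covering edges (i, j) of the dependence graph of `word` (0-based positions)."""
    n = len(word)
    # below[j] collects every position that precedes j in the trace order
    below = [set() for _ in range(n)]
    for j in range(n):
        for i in range(j):
            if (word[i], word[j]) in dep:
                below[j] |= below[i] | {i}
    edges = []
    for j in range(n):
        for i in sorted(below[j]):
            # (i, j) is redundant if some k lies strictly between i and j
            if not any(i in below[k] for k in below[j]):
                edges.append((i, j))
    return edges


def min_letters(word, dep):
    """Letters labelling minimal vertices: no earlier position is dependent."""
    return {c for j, c in enumerate(word)
            if all((word[i], c) not in dep for i in range(j))}


def max_letters(word, dep):
    """Letters labelling maximal vertices: no later position is dependent."""
    return {c for i, c in enumerate(word)
            if all((c, word[j]) not in dep for j in range(i + 1, len(word)))}


# Dependence alphabet a - b - c and the trace x = abcabca from the text
D = dependence([("a", "b"), ("b", "c")], "abc")
print(min_letters("abcabca", D), max_letters("abcabca", D))
print(hasse_edges("abcabca", D))
```

The representing word is only used as a carrier: any other word in the same congruence class yields an isomorphic graph and the same sets of minimal and maximal letters.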



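For $B \subseteq \Sigma$ and $\# \in \{\subseteq, =, \supseteq, \neq\}$ we define:

$$I(B) = \{a \in \Sigma \mid \forall b \in B : (a, b) \in I\},$$
$$D(B) = \{a \in \Sigma \mid \exists b \in B : (a, b) \in D\},$$
$$(\mathrm{Min} \mathrel{\#} B) = \{x \in \mathbb{R} \mid \min(x) \mathrel{\#} B\},$$
$$(\mathrm{Max} \mathrel{\#} B) = \{x \in \mathbb{M} \mid \max(x) \mathrel{\#} B\}.$$

Note that $(\mathrm{Max} \mathrel{\#} B)$ denotes a finitary language, whereas
$(\mathrm{Min} \mathrel{\#} B)$ is a real trace language. If $B$ happens to be a
singleton, then we usually omit braces, e.g. we write $I(b)$ or
$(\mathrm{Max} = b)$.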



3 Temporal Logic for Traces 

The syntax of the temporal logic LTL($\Sigma$) is defined as follows. There are
a constant symbol $\bot$ representing false, the logical connectives $\neg$
(not) and $\vee$ (or), for each $a \in \Sigma$ a unary operator
$\langle a \rangle$, called next-$a$, and a binary operator $\mathsf{U}$, called
until. Formally, the syntax is given by:

$$\varphi ::= \bot \mid \neg\varphi \mid \varphi \vee \varphi \mid \langle a \rangle \varphi \mid \varphi \mathbin{\mathsf{U}} \varphi,$$

where $a \in \Sigma$.

The semantics is usually defined by saying when some formula $\varphi$ is
satisfied by some real trace $z$ at some configuration (i.e., finite prefix)
$x$; hence by defining $(z, x) \models \varphi$. Since our temporal logic uses
future modalities only, we have $(z, x) \models \varphi$ if and only if
$(y, 1) \models \varphi$, where $y$ is the unique trace satisfying $z = xy$.
Therefore, we do not need to deal with configurations and it is enough to say
when a trace satisfies a formula at the empty configuration, denoted simply by
$z \models \varphi$. This is done inductively on the formula as follows:

$z \not\models \bot$,
$z \models \neg\varphi$ if $z \not\models \varphi$,
$z \models \varphi \vee \psi$ if $z \models \varphi$ or $z \models \psi$,
$z \models \langle a \rangle \varphi$ if $z = ay$ and $y \models \varphi$,
$z \models \varphi \mathbin{\mathsf{U}} \psi$ if $z = xy$, $x \in \mathbb{M}$, $y \models \psi$, and $x = x'x''$, $x'' \neq 1$ implies $x''y \models \varphi$.

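As usual, we define $L_\mathbb{R}(\varphi) = \{x \in \mathbb{R} \mid x \models \varphi\}$.
We say that a trace language $L \subseteq \mathbb{R}$ is expressible in
LTL($\Sigma$), if there exists a formula $\varphi \in$ LTL($\Sigma$) such that
$L = L_\mathbb{R}(\varphi)$.

Equivalently, we can define inductively the language $L_\mathbb{R}(\varphi)$ as follows:

$L_\mathbb{R}(\bot) = \emptyset$
$L_\mathbb{R}(\neg\varphi) = \mathbb{R} \setminus L_\mathbb{R}(\varphi)$
$L_\mathbb{R}(\varphi \vee \psi) = L_\mathbb{R}(\varphi) \cup L_\mathbb{R}(\psi)$
$L_\mathbb{R}(\langle a \rangle \varphi) = a\,L_\mathbb{R}(\varphi)$
$L_\mathbb{R}(\varphi \mathbin{\mathsf{U}} \psi) = L_\mathbb{R}(\varphi) \mathbin{\mathsf{U}} L_\mathbb{R}(\psi)$

where the until operator $\mathsf{U}$ is defined on real trace languages by

$$L \mathbin{\mathsf{U}} K = \{xy \mid x \in \mathbb{M},\ y \in K, \text{ and } x = x'x'',\ x'' \neq 1 \text{ implies } x''y \in L\}.$$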
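For finite traces the inductive semantics can be executed directly by brute force. The sketch below is only an illustration under our own assumptions: formulas are encoded as nested tuples, a trace is given by a representing word, and configurations are enumerated as downward-closed sets of positions, which is exponential and intended for very short traces only. It does not handle infinite traces.

```python
from itertools import combinations

BOT = ("bot",)
TOP = ("not", BOT)


def F(f):
    """Eventually f, i.e. TOP until f."""
    return ("until", TOP, f)


def stop(alphabet):
    """Termination: no letter can be read next."""
    some_next = BOT
    for a in alphabet:
        some_next = ("or", some_next, ("next", a, TOP))
    return ("not", some_next)


def prefixes(word, dep):
    """Yield every configuration of the trace as a downward-closed set of positions."""
    n = len(word)
    for r in range(n + 1):
        for chosen in combinations(range(n), r):
            s = set(chosen)
            if all(i in s for j in s for i in range(j) if (word[i], word[j]) in dep):
                yield s


def remove(word, positions):
    """Representing word of the suffix left after deleting a configuration."""
    return tuple(c for k, c in enumerate(word) if k not in positions)


def sat(word, f, dep):
    """Decide z |= f for the finite trace z represented by `word`."""
    word = tuple(word)
    op = f[0]
    if op == "bot":
        return False
    if op == "not":
        return not sat(word, f[1], dep)
    if op == "or":
        return sat(word, f[1], dep) or sat(word, f[2], dep)
    if op == "next":
        # z = a y holds iff the first occurrence of a is a minimal vertex
        for k, c in enumerate(word):
            if c == f[1]:
                if all((word[i], c) not in dep for i in range(k)):
                    return sat(word[:k] + word[k + 1:], f[2], dep)
                return False
        return False
    if op == "until":
        for x in prefixes(word, dep):
            if not sat(remove(word, x), f[2], dep):
                continue
            # every x' with x = x'x'', x'' != 1 must leave a suffix satisfying f[1]
            if all(sat(remove(word, y), f[1], dep)
                   for y in prefixes(word, dep) if y < x):
                return True
        return False
    raise ValueError(f"unknown operator {op!r}")


# Dependence alphabet a - b - c, as in the example of Section 2
DEP = {(x, x) for x in "abc"} | {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
print(sat("ac", ("next", "c", TOP), DEP))   # a and c commute, so c is minimal
print(sat("abc", ("next", "c", TOP), DEP))  # here c lies above b
```

Since satisfaction is invariant under the congruence, the result does not depend on which representing word is passed in.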

Remark 1. For comparison let us mention that the syntax and semantics of the
logic LTrL defined in fZj and of the logic LTL$_f$ are very similar. For LTrL,
the difference is only that there is in addition for each letter $a \in \Sigma$
a constant $\langle a^{-1} \rangle \top$. Since the constant
$\langle a^{-1} \rangle \top$ refers to the past, we need to use configurations
to define its semantics: It holds $(z, x) \models \langle a^{-1} \rangle \top$
if and only if $a \in \max(x)$. For LTL$_f$, the difference is that for each
subalphabet $B \subseteq \Sigma$ there is in addition a modality
$\langle B^* \rangle \varphi$ whose semantics is given by
$z \models \langle B^* \rangle \varphi$ if $z = xy$, $x \in \mathbb{M}_B$, and
$y \models \varphi$. It is not clear whether there is a direct translation of
LTrL or LTL$_f$ to LTL; or between LTrL and LTL$_f$.

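The following operators are standard abbreviations. They serve as macros.

  $\top$                := $\neg\bot$                                          true,
  $a$                   := $\langle a \rangle \top$                            for $a \in \Sigma$,
  $A$                   := $\bigvee_{a \in A} \langle a \rangle \top$          for $A \subseteq \Sigma$,
  $\mathsf{X}\varphi$   := $\bigvee_{a \in \Sigma} \langle a \rangle \varphi$  neXt $\varphi$,
  stop                  := $\neg\mathsf{X}\top$                                termination,
  $\mathsf{F}\varphi$   := $\top \mathbin{\mathsf{U}} \varphi$                 future or eventually $\varphi$,
  $\mathsf{G}\varphi$   := $\neg\mathsf{F}\neg\varphi$                         globally or always $\varphi$.

Example 1. $L_\mathbb{R}(\text{stop}) = \{1\}$, $L_\mathbb{R}(\mathsf{F}\,\text{stop}) = \mathbb{M}$
and for $A \subseteq \Sigma$ we have:

$\mathbb{R}_A = L_\mathbb{R}(\neg\mathsf{F}(\Sigma \setminus A))$,
$\mathbb{M}_A = L_\mathbb{R}(\mathsf{F}\,\text{stop} \wedge \neg\mathsf{F}(\Sigma \setminus A))$,
$(\mathrm{Min} = A) = L_\mathbb{R}(\neg(\Sigma \setminus A) \wedge \bigwedge_{a \in A} a)$,
$(\mathrm{Max} \supseteq A) = L_\mathbb{R}(\bigwedge_{a \in A} \mathsf{F}\langle a \rangle \text{stop})$,
$(\mathrm{alphinf} = A) = L_\mathbb{R}(\mathsf{F}\mathsf{G}\,\neg(\Sigma \setminus A) \wedge \bigwedge_{a \in A} \mathsf{G}\mathsf{F}\,a)$.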



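Remark 2. Later we shall perform an induction on the size of $\Sigma$ leading to
formulae $\varphi \in$ LTL($A$) for $A \subseteq \Sigma$. Such a formula may be
interpreted over $\mathbb{R}_A$ or over $\mathbb{R}$ and we have
$L_{\mathbb{R}_A}(\varphi) = L_\mathbb{R}(\varphi) \cap \mathbb{R}_A$. Hence
$L_{\mathbb{R}_A}(\varphi)$ is expressible in LTL($\Sigma$). Another trivial
observation is that if $\psi \in$ LTL($\Sigma$), then we can construct a formula
$\psi_A \in$ LTL($A$) such that
$L_\mathbb{R}(\psi) \cap \mathbb{R}_A = L_{\mathbb{R}_A}(\psi_A)$.

We will also use an induction on $|\Sigma|$ when the graph $(\Sigma, D)$ is not
connected. Then we find a partition of the alphabet
$\Sigma = \Sigma_1 \cup \Sigma_2$ with $\Sigma_1 \times \Sigma_2 \subseteq I$
and each trace $x \in \mathbb{R}$ can be split into two independent traces $x_1$
and $x_2$ over $\Sigma_1$ and $\Sigma_2$ with $x = x_1 x_2 = x_2 x_1$. The
following lemma can be shown by induction.

Lemma 1. Assume that $\Sigma = \Sigma_1 \cup \Sigma_2$ with
$\Sigma_1 \times \Sigma_2 \subseteq I$. Let $\mathbb{R}_i = \mathbb{R}_{\Sigma_i}$
and $\varphi_i \in$ LTL($\Sigma_i$) for $i = 1, 2$. Then
$L_\mathbb{R}(\varphi_1 \wedge \varphi_2) = L_{\mathbb{R}_1}(\varphi_1) \cdot L_{\mathbb{R}_2}(\varphi_2)$.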

4 First-Order Logic 

The first order theory of traces is given by the syntax of FO($\Sigma$, <):

$$\varphi ::= P_a(x) \mid x < y \mid \neg\varphi \mid \varphi \vee \varphi \mid (\exists x)\varphi,$$

where $a \in \Sigma$ and $x, y \in \mathrm{Var}$ are first order variables. Given
a trace $t = [V, E, \lambda]$ and a valuation of the free variables into the
vertices $\sigma : \mathrm{Var} \to V$, the semantics is obtained by
interpreting the relation $<$ as the transitive closure of $E$ and the predicate
$P_a(x)$ by $\lambda(\sigma(x)) = a$. Then we can say when
$(t, \sigma) \models \varphi$. If $\varphi$ is a closed formula (a sentence),
then the valuation $\sigma$ has an empty domain and we define the language
$L_\mathbb{R}(\varphi) = \{t \in \mathbb{R} \mid t \models \varphi\}$. We say
that a trace language $L \subseteq \mathbb{R}$ is expressible in
FO($\Sigma$, <) if there exists some sentence $\varphi \in$ FO($\Sigma$, <) such
that $L = L_\mathbb{R}(\varphi)$.

Passing from a temporal logic formula to a first order one is not very
difficult. It is well known or belongs to folklore. The transformation relies on
the fact that a prefix (configuration) $p$ of a trace $t$ can be defined by its
maximal vertices. The size of such a set of maximal vertices is bounded by the
maximal number of pairwise independent letters in $\Sigma$. Therefore, a prefix
inside a trace can be defined using a bounded number of first order variables.

Proposition 1. If a trace language is expressible in LTL($\Sigma$), then it is
expressible in FO($\Sigma$, <).

5 Aperiodic Languages

Recall that a finite monoid $S$ is aperiodic, if there is some $n \ge 0$ such
that $s^n = s^{n+1}$ for all $s \in S$. A finitary trace language
$L \subseteq \mathbb{M}$ is aperiodic, if there exists a morphism to some finite
aperiodic monoid $h : \mathbb{M} \to S$ such that $L = h^{-1}(h(L))$. Since our
considerations include infinite traces, we have to extend the notion of an
aperiodic language to real trace languages such that it becomes equivalent for
finitary languages.
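Aperiodicity of a concrete finite monoid can be tested mechanically. The sketch below is an illustration under our own assumptions: the monoid is a transformation monoid given by generators, each mapping on $\{0, \ldots, n-1\}$ is encoded as a tuple of images, and the product is the reverse-order composition. It uses the fact that in a monoid with $m$ elements, $s^k = s^{k+1}$ for some $k$ if and only if $s^m = s^{m+1}$.

```python
def compose(f, g):
    """Product fg with (fg)(x) = g(f(x)); mappings are tuples of images."""
    return tuple(g[f[x]] for x in range(len(f)))


def generated_monoid(generators, size):
    """Close the generators under composition, starting from the identity."""
    identity = tuple(range(size))
    monoid, todo = {identity}, [identity]
    while todo:
        s = todo.pop()
        for g in generators:
            t = compose(s, g)
            if t not in monoid:
                monoid.add(t)
                todo.append(t)
    return monoid


def is_aperiodic(monoid):
    """Check s^m = s^(m+1) for every element s, where m is the monoid size."""
    m = len(monoid)
    for s in monoid:
        power = tuple(range(len(s)))      # s^0 is the identity
        for _ in range(m):
            power = compose(power, s)     # after the loop: power = s^m
        if compose(power, s) != power:
            return False
    return True


shift = (1, 2, 2)   # 0 -> 1 -> 2 -> 2: powers stabilise
swap = (1, 0, 2)    # a transposition generates a non-trivial group
print(is_aperiodic(generated_monoid([shift], 3)))
print(is_aperiodic(generated_monoid([swap], 3)))
```

The second monoid fails the test because it contains a non-trivial permutation, which can never satisfy $s^n = s^{n+1}$.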

Let $h : \mathbb{M} \to S$ be a morphism to some finite monoid $S$. For
$x, y \in \mathbb{R}$, we say that $x$ and $y$ are $h$-similar, denoted by
$x \sim_h y$, if either $x, y \in \mathbb{M}$ and $h(x) = h(y)$ or $x$ and $y$
have infinite factorizations in non-empty finite traces $x = x_1 x_2 \cdots$,
$y = y_1 y_2 \cdots$ with $x_i, y_i \in \mathbb{M} \setminus \{1\}$ and
$h(x_i) = h(y_i)$ for all $i$. According to the definition of $h$-similarity, we
never have $x \sim_h y$ when $x$ is finite and $y$ is infinite. We denote by
$\approx_h$ the transitive closure of $\sim_h$, which is therefore an
equivalence relation. An equivalence class is denoted by
$[x]_{\approx_h} = \{y \in \mathbb{R} \mid y \approx_h x\}$. For a finite trace
$x \in \mathbb{M}$ we have $[x]_{\approx_h} = h^{-1}(h(x))$ and the monoid
$\mathbb{M}$ is covered by at most $|S|$ classes. Using a Ramsey-type argument
we can show that $\mathbb{R} \setminus \mathbb{M}$ is covered by at most
$|S|^2$ classes. Therefore, $\{[x]_{\approx_h} \mid x \in \mathbb{R}\}$ defines a
finite partition of $\mathbb{R}$ of cardinality at most $|S|^2 + |S|$. A real
trace language $L \subseteq \mathbb{R}$ is recognized by $h$ if it is saturated
by $\approx_h$, i.e., $x \in L$ implies $[x]_{\approx_h} \subseteq L$ for all
$x \in \mathbb{R}$. A real trace language $L \subseteq \mathbb{R}$ is aperiodic
if it is recognized by some morphism to some finite and aperiodic monoid.

In order to prove the main theorem we shall use the equivalence between
FO($\Sigma$, <)-definability and aperiodic languages.

Theorem 1 im)- A finitary trace language (real trace language resp.) over
an independence alphabet $(\Sigma, I)$ is expressible in FO($\Sigma$, <) if and
only if it is aperiodic.

We conclude this section by some easy results which will be useful later. The
first proposition will allow us to use an induction on the size of the alphabet
$\Sigma$.

Proposition 2. Let $L \subseteq \mathbb{R}$ be a language recognized by the
morphism $h : \mathbb{M} \to S$ into a finite monoid $S$ and let
$A \subseteq \Sigma$. Then, $L \cap \mathbb{R}_A$ and $L \cap \mathbb{M}_A$ are
recognized by the restriction $h|_{\mathbb{M}_A}$ of $h$ to $\mathbb{M}_A$.

Then, we consider the case where the dependence alphabet $(\Sigma, D)$ is
non-connected. A kind of Mezei's Theorem holds for real trace languages too:

Proposition 3. Let $L \subseteq \mathbb{R}$ be a language recognized by the
morphism $h : \mathbb{M} \to S$ into a finite monoid $S$. Assume that
$\Sigma = \Sigma_1 \cup \Sigma_2$ with $\Sigma_1 \times \Sigma_2 \subseteq I$
and let $\mathbb{R}_i = \mathbb{R}_{\Sigma_i}$, $\mathbb{M}_i = \mathbb{M}_{\Sigma_i}$
and $h_i = h|_{\mathbb{M}_i}$ for $i = 1, 2$. Then, $L$ is a finite union of
products $L_1 \cdot L_2$ where $L_i \subseteq \mathbb{R}_i$ is recognized by
$h_i$ for $i = 1, 2$.

Let $h : \mathbb{M} \to S$ be a morphism to some finite monoid $S$ and let
$L \subseteq \mathbb{R}$. For $s \in S$ we define
$L(s) = \{x \in \mathbb{R} \mid h^{-1}(s)\,x \cap L \neq \emptyset\}$.

Proposition 4. If $L$ is recognized by $h$ then
$L = \bigcup_{s \in S} h^{-1}(s)\,L(s)$ and, for each $s \in S$, the language
$L(s)$ is recognized by $h$.

6 Composition of LTL Languages

In the next section we will prove, by induction, that each aperiodic trace
language is expressible in LTL($\Sigma$). For this, we will write an aperiodic
language as a finite union of products of simpler languages. By induction, the
simpler languages will be shown to be expressible in LTL($\Sigma$). Then we
have to prove that their products and unions are still expressible in
LTL($\Sigma$). Since union corresponds to disjunction, the only problem is with
concatenation. Though it is true in general that the product of languages
expressible in LTL($\Sigma$) is still expressible in LTL($\Sigma$), we do not
have a direct proof of this general result. Actually, it becomes a consequence
of our main theorem since aperiodic trace languages are closed under product.
In this section we show that some special products of expressible languages are
still expressible in LTL($\Sigma$). We now state two crucial composition
lemmas. The proof techniques are similar; for lack of space we show the proof
of Lemma 3 only.

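Lemma 2. Let $b \in \Sigma$, $B = \Sigma \setminus \{b\}$ and
$\Pi = \mathbb{M}_B \cap (\mathrm{Max} \subseteq D(b))$. Let
$L_1, L_2 \subseteq \mathbb{R}$ and $L_3 \subseteq (\mathrm{Min} = b)$ be trace
languages expressible in LTL($\Sigma$). Then the language
$(L_1 \cap \Pi)(L_2 \cap \mathbb{M}_{I(b)})L_3$ is expressible in LTL($\Sigma$).

Lemma 3. Let $b \in \Sigma$ and $B = \Sigma \setminus \{b\}$. Let
$L_1 \subseteq \mathbb{R}$ and $L_2 \subseteq \mathbb{R}_B$ be expressible in
LTL($\Sigma$). Then the language $(L_1 \cap (\mathrm{Max} \subseteq \{b\}))L_2$
is expressible in LTL($\Sigma$).

Proof. The product $(\mathrm{Max} \subseteq \{b\})\,\mathbb{R}_B$ is unambiguous
and we have

$$(L_1 \cap (\mathrm{Max} \subseteq \{b\}))\,L_2 = (L_1 \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B \;\cap\; (\mathrm{Max} \subseteq \{b\})\,L_2.$$

1) We first show that $(L_1 \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B$ is
expressible in LTL($\Sigma$). This is done by induction on the formula which
defines $L_1$. The cases $\bot$ and $\varphi \vee \psi$ are trivial.

• $\neg\varphi$: Since the product $(\mathrm{Max} \subseteq \{b\})\,\mathbb{R}_B$
is unambiguous, we have

$$(L_\mathbb{R}(\neg\varphi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B = (\mathrm{Max} \subseteq \{b\})\,\mathbb{R}_B \setminus (L_\mathbb{R}(\varphi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B.$$

Moreover $(\mathrm{Max} \subseteq \{b\})\,\mathbb{R}_B$ is the set
$(\mathrm{Alphinf} \subseteq B)$, which is expressible in LTL($\Sigma$) by the
formula $\mathsf{F}\mathsf{G}\,\neg b$.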

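• $\langle a \rangle \varphi$: The language
$(L_\mathbb{R}(\langle a \rangle \varphi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B$
is equal to the language
$a\big((L_\mathbb{R}(\varphi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B \cap (a^{-1}(\mathrm{Max} = b))\,\mathbb{R}_B\big)$.
We conclude by induction since we have
$(a^{-1}(\mathrm{Max} = b))\,\mathbb{R}_B = L_\mathbb{R}(\psi)$ with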

if= y F{bi)---F{bk)hFb) 

a— bo —bi bk—b 

where the disjunction ranges over all simple paths from a to 6 in the graph of 
the dependence alphabet (S,D). 

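• $\varphi \mathbin{\mathsf{U}} \psi$: This case follows by induction from the formula

$$(L_\mathbb{R}(\varphi \mathbin{\mathsf{U}} \psi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B = (L_\mathbb{R}(\varphi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B \mathbin{\mathsf{U}} (L_\mathbb{R}(\psi) \cap (\mathrm{Max} \subseteq \{b\}))\,\mathbb{R}_B.$$

2) Next, if $\varphi \in$ LTL($\Sigma$) is a formula such that
$L_\mathbb{R}(\varphi) \subseteq \mathbb{R}_B$ then we have
$(\mathrm{Max} \subseteq \{b\}) \cdot L_\mathbb{R}(\varphi) = L_\mathbb{R}((\mathsf{F}\,b) \mathbin{\mathsf{U}} \varphi)$.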

7 Kamp’s Theorem for Real Traces 

Theorem 2. A finitary trace language (real trace language resp.) is expressible
in FO($\Sigma$, <) if and only if it is expressible in LTL($\Sigma$).

By Proposition 1 and Theorem 1 it is enough to show that all aperiodic
languages in $\mathbb{R}(\Sigma, I)$ are expressible in LTL($\Sigma$). The
overall strategy is as in |2|, but we must not use any filter operator. The
filter operators allowed us to express directly a language of type
$\mathbb{M}_A \cdot L_\mathbb{R}(\varphi)$ for $A \subseteq \Sigma$ and
$\varphi \in$ LTL. This simplified the proof considerably, but cannot be used
here. We can circumvent these difficulties thanks to the results of the
previous section.

Let $Q$ be a finite set of states. We denote by Trans($Q$) the monoid of
mappings from $Q$ to $Q$. The multiplication is the composition (in reverse
order) of mappings: $(fg)(x) = g(f(x))$; and the unit element is the identity
$\mathrm{id}_Q$. We will use the fact that every finite monoid $S$ can be
realized as a submonoid of some Trans($Q$) where $|Q| \le |S|$. Indeed, it
suffices to consider the right action of $S$ over itself. More precisely, if we
define $\chi(s) \in$ Trans($S$) by $\chi(s)(t) = ts$ then it is easy to see that
$\chi : S \to$ Trans($S$) is an injective morphism.
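The embedding by the right action can be verified on a small example. The sketch below is an illustration only: the three-element monoid (an identity together with two right-zero elements) and all names are hypothetical choices of ours, not taken from the paper.

```python
def compose(f, g):
    """Product in Trans(Q): (fg)(x) = g(f(x)), mappings encoded as tuples of images."""
    return tuple(g[f[x]] for x in range(len(f)))


def right_regular_representation(elements, mul):
    """Map each s to chi(s), where chi(s)(t) = t*s, indexed like `elements`."""
    index = {t: k for k, t in enumerate(elements)}
    return {s: tuple(index[mul(t, s)] for t in elements) for s in elements}


def mul(x, y):
    """Hypothetical example monoid {1, a, b}: 1 is the unit and xy = y otherwise."""
    return x if y == "1" else y


elements = ["1", "a", "b"]
chi = right_regular_representation(elements, mul)

# chi is injective: distinct elements act differently (already on the unit)
print(len(set(chi.values())) == len(elements))
# chi is a morphism for the reverse-order composition: chi(st) = chi(s) chi(t)
print(all(chi[mul(s, t)] == compose(chi[s], chi[t])
          for s in elements for t in elements))
```

The reverse-order convention is what makes the right action a morphism: applying $\chi(s)$ and then $\chi(t)$ sends $x$ to $(xs)t = x(st)$.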

We deduce that every aperiodic trace language can be recognized by some
morphism $h : \mathbb{M}(\Sigma, I) \to S \subseteq$ Trans($Q$) where $S$ is
aperiodic. We show by induction on $(|Q|, |\Sigma|)$ that all languages
recognized by $h$ are expressible in LTL($\Sigma$). We use the following
well-founded lexicographic order on $\mathbb{N}^2$ for this induction:
$(m, n) < (m', n')$ if and only if $m < m'$ or $m = m'$ and $n < n'$.

First, assume that $h(a) = \mathrm{id}_Q$ for all $a \in \Sigma$, which is in
particular the case when $|Q| = 1$. If $L \subseteq \mathbb{R}$ is recognized by
$h$ then $L$ is one of the sets $\emptyset$, $\mathbb{R}$, $\mathbb{M}$ or
$\mathbb{R} \setminus \mathbb{M}$, which are respectively defined by the
formulas $\bot$, $\top$, $\mathsf{F}\,\text{stop}$, and $\mathsf{G}\mathsf{X}\top$.
This shows the basis case of the induction.

Second, assume that $h(b) \neq \mathrm{id}_Q$ for some $b \in \Sigma$. The
crucial observation here is that $h(b)$ is not a permutation of $Q$. Indeed,
since $S$ is aperiodic, there exists some $n$ such that $h(b)^n = h(b)^{n+1}$.
If $h(b)$ were a permutation then it would be invertible and the last equality
would imply $h(b) = \mathrm{id}_Q$, a contradiction. Hence, $h(b)(Q) = Q'$ for
some $Q' \subseteq Q$ with $|Q'| < |Q|$.

We let $B = \Sigma \setminus \{b\}$ and we define two subsets of $\mathbb{M}$:

$$\Pi = \{x \in \mathbb{M}_B \mid \max(x) \subseteq D(b)\},$$
$$\Gamma = \{x \in \mathbb{M}_B \mid \min(x) \subseteq D(b)\}.$$

The notation $\Pi$ is chosen since $\Pi b$ are exactly the pyramids of
$\mathbb{M}$ where the unique maximal element is $b$. It should be noted that
$(\Pi b)^*$ and $(\Gamma b)^*$ are free submonoids of $\mathbb{M}$, both being
infinitely generated if $D(b) \neq \{b\}$.

By $\Delta$ we denote the subset of real traces $x$ which are either in
$\mathbb{M}b$ or which can be factorized into an infinite product of finite
traces such that all factors



LTL Is Expressively Complete for Mazurkiewicz Traces 219 



except the first belong to $\Gamma b$, that is,
$\Delta = \mathbb{M}b(\Gamma b)^\infty$. The set $\Delta$, which plays a key
role, admits the following unambiguous decomposition:

$$\Delta = \Pi\,\mathbb{M}_{I(b)}\,b\,(\Gamma b)^\infty.$$

This decomposition is best visualized by the following picture; it is in some
sense the guide for the modular construction of a formula defining the language
$L \cap \Delta$.




The core of the proof is now the following proposition. 

Proposition 5. Let $L \subseteq \mathbb{R}$ be recognized by $h$. Then,
$L \cap \Delta$ is expressible in LTL($\Sigma$).

The proof of this proposition will be given later in Section 7.1. Here we show
how it is used in the proof of Theorem 2. We start with two corollaries.

Corollary 1. Assume that $(\Sigma, D)$ is connected and let
$L \subseteq \mathbb{R}$ be recognized by $h$. Then the language
$L \cap (\mathrm{Alphinf} = \Sigma)$ is expressible in LTL($\Sigma$).

Proof. Since $(\Sigma, D)$ is connected, we have
$(\mathrm{Alphinf} = \Sigma) \subseteq \Delta$. Therefore,

$$L \cap (\mathrm{Alphinf} = \Sigma) = (L \cap \Delta) \cap (\mathrm{Alphinf} = \Sigma).$$

By Proposition 5 we know that $L \cap \Delta$ is expressible in LTL($\Sigma$)
and we conclude easily since
$(\mathrm{Alphinf} = \Sigma) = L_\mathbb{R}(\bigwedge_{a \in \Sigma} \mathsf{G}\mathsf{F}\,a)$.



Corollary 2. Let $c \in \Sigma$, $C = \Sigma \setminus \{c\}$ and let
$L \subseteq \mathbb{R}$ be recognized by $h$. Then the language
$L \cap (\mathrm{Alphinf} \subseteq C)$ is expressible in LTL($\Sigma$).

Proof. We claim that $L \cap (\mathrm{Alphinf} \subseteq C)$ is the finite union

$$\bigcup_{u,v \in S} \Big( \big( (h^{-1}(u) \cap (\mathrm{Max} \subseteq \{b\})) \cdot (h^{-1}(v) \cap \mathbb{M}_B) \big) \cap (\mathrm{Max} \subseteq \{c\}) \Big) \cdot (L(uv) \cap \mathbb{R}_C).$$

We will now apply Lemma 3 twice in order to conclude the proof of Corollary 2.
We fix some $u, v \in S$. First, by Propositions 2 and 4 we know that
$K_2 = L(uv) \cap \mathbb{R}_C$ is recognized by the morphism
$h|_{\mathbb{M}_C}$ and $L_2 = h^{-1}(v) \cap \mathbb{M}_B$ is recognized by the
morphism $h|_{\mathbb{M}_B}$. By induction on the size of the alphabet, we
deduce that $L_2$ and $K_2$ are expressible in LTL($B$) and LTL($C$)
respectively. By Remark 2 they are also expressible in LTL($\Sigma$).

Second, note that $(\mathrm{Max} \subseteq \{b\}) \subseteq \Delta \cup \{1\}$. Hence,

$$h^{-1}(u) \cap (\mathrm{Max} \subseteq \{b\}) = (h^{-1}(u) \cap (\Delta \cup \{1\})) \cap (\mathrm{Max} \subseteq \{b\}).$$

By Proposition 5 the language $L_1 = h^{-1}(u) \cap (\Delta \cup \{1\})$ is
expressible in LTL($\Sigma$). Applying Lemma 3 once, we deduce first that

$$K_1 = (L_1 \cap (\mathrm{Max} \subseteq \{b\})) \cdot L_2 = (h^{-1}(u) \cap (\mathrm{Max} \subseteq \{b\})) \cdot (h^{-1}(v) \cap \mathbb{M}_B)$$

is expressible in LTL($\Sigma$). Next, applying Lemma 3 a second time, we deduce
that $(K_1 \cap (\mathrm{Max} \subseteq \{c\})) \cdot K_2$ is expressible in
LTL($\Sigma$). Therefore, $L \cap (\mathrm{Alphinf} \subseteq C)$ is expressible
in LTL($\Sigma$).

In order to conclude the proof of Theorem 2 we use the following proposition.

Proposition 6. Assume that $(\Sigma, D)$ is non-connected and let
$L \subseteq \mathbb{R}$ be recognized by $h$. Then $L$ is expressible in
LTL($\Sigma$).

Proof. Assume that $\Sigma = \Sigma_1 \cup \Sigma_2$ with
$\Sigma_1 \times \Sigma_2 \subseteq I$ and let
$\mathbb{R}_i = \mathbb{R}_{\Sigma_i}$, $\mathbb{M}_i = \mathbb{M}_{\Sigma_i}$
and $h_i = h|_{\mathbb{M}_i}$ for $i = 1, 2$. By Proposition 3, $L$ is a finite
union of products $L_1 \cdot L_2$ where $L_i \subseteq \mathbb{R}_i$ is
recognized by $h_i$ for $i = 1, 2$. By induction on the size of the alphabet, we
deduce that $L_1$ and $L_2$ are expressible in LTL($\Sigma_1$) and
LTL($\Sigma_2$) respectively. Using Lemma 1 we deduce that $L_1 \cdot L_2$ is
expressible in LTL($\Sigma$).



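7.1 Proof of Proposition 5

Let us first show that the set $(\Gamma b)^\infty$ is expressible in LTL($\Sigma$).

Lemma 4. The real trace language $(\Gamma b)^\infty$ is expressed by the formula

$$(\mathrm{Min} \subseteq D(b)) \wedge \big(\text{stop} \vee \mathsf{F}\langle b \rangle \text{stop} \vee \mathsf{G}\mathsf{F}(\mathrm{Min} = b)\big)$$

where we write simply $(\mathrm{Min} \subseteq D(b))$ and $(\mathrm{Min} = b)$
instead of the corresponding LTL($\Sigma$) formula.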

Now, recall that $h(b)(Q) = Q'$. Each $s \in h((\Gamma b)^*)$ maps the subset
$Q'$ to $Q'$. Hence we may define two subsets $T, T' \subseteq$ Trans($Q'$) by
$T = \{s|_{Q'} \mid s \in h(\Gamma b)\}$ and
$T' = \{s|_{Q'} \mid s \in h((\Gamma b)^*)\}$. Since $h((\Gamma b)^*)$ is a
submonoid of $S$, the set $T'$ is a monoid. Moreover, the monoid $T'$ is
generated by $T$ and is aperiodic since $S$ is aperiodic.

By $T^*$ we denote the free monoid generated by the finite set $T$ (here $T$ is
viewed as an alphabet). Accordingly, $T^\infty$ means the set of finite or
infinite words over the alphabet $T$. The inclusion $T \subseteq T'$ induces a
canonical morphism $e : T^* \to T'$ which is called the evaluation.

The mapping $\sigma : \Gamma b \to T$, defined by the restriction
$\sigma(x) = h(x)|_{Q'}$, induces a morphism $\sigma : (\Gamma b)^* \to T^*$
between free monoids. The mapping $\sigma$ is also extended to infinite
sequences $\sigma : (\Gamma b)^\omega \to T^\omega$, so finally we have
$\sigma : (\Gamma b)^\infty \to T^\infty$.

Since $T'$ is a submonoid of Trans($Q'$) and $|Q'| < |Q|$, we may use induction
(although we might have $|T| > |\Sigma|$). More precisely, we may assume that
every language $K \subseteq T^\infty$ which is recognized by the morphism $e$ is
expressible in LTL($T$). The following lemma allows us to make use of this
induction step.

Lemma 5. Let $K \subseteq T^\infty$ be a word language expressible in LTL($T$).
Then the real trace language $\sigma^{-1}(K) \subseteq (\Gamma b)^\infty$ is
expressible in LTL($\Sigma$).

Proof. The statement of the lemma corresponds to Lemmas 5 and 9 in 0. The
proof of Lemma 5 given there can be translated to our situation without any
difficulty. It is therefore omitted.



Lemma 6. Let $L \subseteq \mathbb{R}$ be recognized by $h$. Then
$L \cap b(\Gamma b)^\infty$ is expressible in LTL($\Sigma$).

Proof. We define the language $K \subseteq T^\infty$ with respect to the language $L$ by

$$K = \{\sigma(x) \in T^\infty \mid bx \in L \cap b(\Gamma b)^\infty\}.$$

It was shown in 0 Lem. 60 that $L \cap b(\Gamma b)^\infty = b\,\sigma^{-1}(K)$
and that $K$ is recognized by the morphism $e : T^* \to T'$.

Now, it is easy to conclude the proof. Since $|Q'| < |Q|$ and $K$ is recognized
by $e : T^* \to T' \subseteq$ Trans($Q'$), we know by induction that $K$ is
expressible in LTL($T$). Using Lemma 5 we deduce that $\sigma^{-1}(K)$ is
expressible in LTL($\Sigma$). Therefore,
$L \cap b(\Gamma b)^\infty = b\,\sigma^{-1}(K)$ is expressible in LTL($\Sigma$).

We are now ready to complete the proof of Proposition 5. Let
$L \subseteq \mathbb{R}$ be a real trace language recognized by the morphism
$h$. We claim that

$$L \cap \Delta = \bigcup_{u,v \in S} (h^{-1}(u) \cap \Pi)\,(h^{-1}(v) \cap \mathbb{M}_{I(b)})\,(L(uv) \cap b(\Gamma b)^\infty).$$

Indeed, let $t = xyz$ with $x \in h^{-1}(u)$, $y \in h^{-1}(v)$ and
$z \in L(uv)$ for some $u, v \in S$. Then $h(xy) = uv$ and we deduce from
Proposition 4 that $t = xyz \in L$. Conversely, let $t \in L \cap \Delta$. Using
the unambiguous factorization
$\Delta = \Pi\,\mathbb{M}_{I(b)}\,b\,(\Gamma b)^\infty$ we can write $t = xyz$
with $x \in \Pi$, $y \in \mathbb{M}_{I(b)}$ and $z \in b(\Gamma b)^\infty$. If we
let $u = h(x)$ and $v = h(y)$ then we get $z \in L(uv)$ which concludes the
proof of the claim.

By Proposition 4 we know that $L(uv)$ is recognized by $h$, hence by Lemma 6
the language $L_3 = L(uv) \cap b(\Gamma b)^\infty \subseteq (\mathrm{Min} = b)$
is expressible in LTL($\Sigma$). By Proposition 2 the languages
$L_1 = h^{-1}(u) \cap \mathbb{M}_B$ and $L_2 = h^{-1}(v) \cap \mathbb{M}_B$ are
recognized by the restriction of $h$ to $\mathbb{M}_B$. Hence, by induction on
the size of the alphabet, they are expressible in LTL($B$), and also in
LTL($\Sigma$) by Remark 2.

Footnote: In El we used a slightly different notion of h-similarity, but the
proof given there applies in fact to the definition as it is used here.

Since $\Pi \cup \mathbb{M}_{I(b)} \subseteq \mathbb{M}_B$, we have
$h^{-1}(u) \cap \Pi = L_1 \cap \Pi$ and
$h^{-1}(v) \cap \mathbb{M}_{I(b)} = L_2 \cap \mathbb{M}_{I(b)}$.
Using Lemma 2 we deduce that

$$(L_1 \cap \Pi)(L_2 \cap \mathbb{M}_{I(b)})L_3 = (h^{-1}(u) \cap \Pi)\,(h^{-1}(v) \cap \mathbb{M}_{I(b)})\,(L(uv) \cap b(\Gamma b)^\infty)$$

is expressible in LTL($\Sigma$).

Abstract. Interval Temporal Logic (ITL) is a formalism for reasoning about time
periods. To date no one has proved completeness of a relatively simple ITL
deductive system supporting infinite time and permitting infinite sequential
iteration comparable to $\omega$-regular expressions. We have developed a
complete axiomatization for such a version of quantified ITL over finite
domains and can show completeness by representing finite-state automata in ITL
and then translating ITL formulas into them. Here we limit ourselves to finite
time. The full paper (and another conference paper [15]) extends the approach
to infinite time.



1 Introduction 

Interval Temporal Logic (ITL) [6, 9, 10] is a temporal logic which includes a
basic construct for the sequential composition of two formulas as well as an
analog of Kleene star. Within ITL, one can express both finite-state automata
and regular expressions. Its notation makes it suitable for logic-based modular
reasoning involving periods of time, refinement [2], sequential composition
using assumptions and commitments based on fixpoints of various temporal
operators [12, 14] and for executable specifications [11]. Various imperative
programming constructs are expressible in ITL and projection between time
granularities is available (but not considered here). Zhou Chaochen, Hoare and
Ravn [21] have developed an ITL extension called Duration Calculus for hybrid
systems. Several researchers have looked at ITL decision procedures and axiom
systems. However, previously there was no known complete axiom system for a
version of ITL over both finite and $\omega$-words having no artificial
restrictions on interval constructs. We present a natural and complete
axiomatization for a subset of quantified ITL for finite time in which
variables are limited to finite domains. The completeness proof describes an
ITL decision procedure within ITL itself. In the full paper (and another
conference paper [15]) infinite time is considered.

* Part of the research described here has been kindly supported by EPSRC
research grant GR/K25922.

We build on the work of Siefkes [19], who proved the completeness of an
axiomatization of the Second-Order Theory of Successor (S1S), and Kesten and
Pnueli [7], who did likewise for Quantified Propositional Temporal Logic (QPTL)
with past-time operators. Our approach follows Kesten and Pnueli's technique of
reducing temporal formulas to finite-state automata as part of a decision
procedure. These automata are themselves represented and manipulated in ITL.
The ITL axiom system and completeness proof vary substantially from Kesten and
Pnueli's. This reflects differences between conventional temporal logics and an
interval-based one.

Our results offer a natural yet complete axiom system for a nontrivial subset
of ITL and show how ITL can itself encode the decision procedure. Automata are
more compositional than temporal logic tableaux which analyze several formulas
in parallel. There is no need for Fischer-Ladner closures [4], first developed
for a propositional version of Pratt's Dynamic Logic [17]. We also show that
the ITL axiom system provides a logical framework for both finite-state
automata and regular expressions.

2 Related Work 

Let us now discuss other work on ITL axiom systems. Rosner and Pnueli [18]
investigate an axiom system for quantifier-free propositional ITL (PITL) with
finite and $\omega$-intervals. The ITL subset also includes the until operator
but not the operator chop-star which is like Kleene-star for regular
expressions. A tableaux-based decision procedure underlies the completeness
proof and uses an adaptation of the Fischer-Ladner closures. One inference rule
requires detailed meta-reasoning about tableaux transitions.

Paech [16] investigates PITL with $\omega$-intervals but includes a chop-star
limited, like Kleene-star, to finitely many iterations and another operator
unless. She gives a complete proof system with some nonconventional axioms for
formulas already in a form like regular expressions and possibly involving
complex meta-reasoning. A generalization of Fischer-Ladner closures is used.

Dutertre [3] gives two complete proof systems for first-order ITL without
chop-star for finite time. One has a possible-worlds semantics of time and the
other uses arbitrary linear orderings of states. Neither is complete for
standard discrete-time intervals. Wang Hanpin and Xu Qiwen [20] generalize this
to infinite time. Moszkowski [12] presents propositional and first-order ITL
axiom systems for finite intervals. The former is claimed to be complete but
only an outline of a proof is given. Axioms for temporal projection are given
in [13]. Bowman and Thompson [1] have recently developed a tableaux-based
completeness proof for an axiomatization of ITL with projection and finite
time.

3 Overview of Interval Temporal Logic 

We now briefly describe ITL for finite time. More details are in [6, 8-12, 14].
Basic ITL uses discrete, linear time. An interval $\sigma$ has a length
$|\sigma| \ge 0$ and a finite, nonempty sequence of $|\sigma| + 1$ states
$\sigma_0, \ldots, \sigma_{|\sigma|}$. A state $\sigma_i$ maps a variable such
as $A$ to a value $\sigma_i(A)$. Lower-case static variables $a, b, \ldots$ do
not vary over time.

Here are permitted constructs using variable $v$, terms $t$ and $t'$ and
formulas $P$ and $Q$:

Terms: $v$ (for numerical $v$), 0, 1, 2, ... (natural numbers), if $P$ then $t$ else $t'$

Formulas: $v$ (for boolean $v$), $t = t'$, $\forall v.\,P$, $\neg P$, $P \wedge Q$, skip, $P; Q$, $P^*$

A variable $v$'s values in an interval range over the finite, nonempty set
domain($v$) which here is either {false, true} or some initial subsequence of
the natural numbers. Finite data domains ensure we have a decision procedure
for our completeness result. We can readily extend domain to all ITL
constructs. As in several temporal logics, the formula $I = 2$ is true on
$\sigma$ iff $I$'s value in $\sigma_0$ equals 2.

There are three primitive temporal operators:

skip        $P; Q$ (chop)        $P^*$ (chop-star),

where $P$ and $Q$ are themselves formulas. The formula skip is true on a
two-state interval. A formula $P; Q$ is true on $\sigma$ iff $\sigma$ can be
chopped into two subintervals sharing a state $\sigma_k$ for some
$k \le |\sigma|$ with $P$ true on $\sigma_0 \ldots \sigma_k$ and $Q$ true on
$\sigma_k \ldots \sigma_{|\sigma|}$. Thus the formula skip; $I = 1$ is true on
$\sigma$ iff $\sigma$ has at least two states and $I = 1$ is true in
$\sigma_1$. A formula $P^*$ is true on $\sigma$ iff $\sigma$ can be chopped into
zero or more parts with $P$ true on each. Any formula $P^*$ (including
false$^*$) is true on a one-state interval (see §3.2). Figure 1a pictorially
illustrates the semantics of skip, chop, and chop-star. Some simple ITL formulas
together with intervals which satisfy them are shown in Fig. 1b.







[Figure 1, diagrams not reproduced.
 (a) Informal semantics of skip, $P; Q$ and $P^*$.
 (b) Some examples: $I = 1$, $I = 1 \wedge skip$, $skip; I = 1$ (i.e., $\bigcirc I = 1$),
     $true; I \ne 1$ (i.e., $\Diamond I \ne 1$), and $\neg(true; I \ne 1)$ (i.e., $\Box I = 1$).]



Fig. 1. Informal ITL semantics and examples 



For natural numbers $i, j$ with $i \le j \le |\sigma|$, let $\sigma_{i:j}$ denote the subinterval of length
$j - i$ (i.e., $j - i + 1$ states) with starting state $\sigma_i$ and final state $\sigma_j$. Below is the syntax
and semantics of the basic ITL constructs used here. We denote the semantics of a term
$t$ and formula $P$ on interval $\sigma$ as $\mathcal{M}_\sigma[[t]]$ and $\mathcal{M}_\sigma[[P]]$.



3.1 Semantics of Terms 



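- Numerical static or state variable: $\mathcal{M}_\sigma[[v]] = \sigma_0(v)$.
  A numerical variable's value for an interval $\sigma$ is the value in $\sigma$'s initial state $\sigma_0$.

- Numerical constant: $\mathcal{M}_\sigma[[c]] = c$.

- Conditional term: $\mathcal{M}_\sigma[[\mathit{if}\ P\ \mathit{then}\ t\ \mathit{else}\ t']] = \mathcal{M}_\sigma[[t]]$ if $\mathcal{M}_\sigma[[P]] = true$,
  and $\mathcal{M}_\sigma[[t']]$ otherwise.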



3.2 Semantics of Formulas 

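- Boolean static or state variable: $\mathcal{M}_\sigma[[v]] = \sigma_0(v)$.
  A boolean variable's value for an interval $\sigma$ is the value in $\sigma$'s initial state $\sigma_0$.

- Equality: $\mathcal{M}_\sigma[[t = t']] = true$ iff $\mathcal{M}_\sigma[[t]] = \mathcal{M}_\sigma[[t']]$.

- Negation: $\mathcal{M}_\sigma[[\neg P]] = true$ iff $\mathcal{M}_\sigma[[P]] = false$.

- Conjunction: $\mathcal{M}_\sigma[[P \wedge Q]] = true$ iff $\mathcal{M}_\sigma[[P]] = \mathcal{M}_\sigma[[Q]] = true$.

- Universal quantification: $\mathcal{M}_\sigma[[\forall v.\,P]] = true$ iff $\mathcal{M}_{\sigma'}[[P]] = true$,
  for every interval $\sigma'$ identical to $\sigma$ except possibly for variable $v$'s behavior.

- Unit interval: $\mathcal{M}_\sigma[[skip]] = true$ iff $|\sigma| = 1$.

- Chop: $\mathcal{M}_\sigma[[P; Q]] = true$ iff $\mathcal{M}_{\sigma'}[[P]] = true$ and $\mathcal{M}_{\sigma''}[[Q]] = true$,
  where $\sigma' = \sigma_{0:i}$ and $\sigma'' = \sigma_{i:|\sigma|}$ for some $i \le |\sigma|$. Intervals $\sigma'$ and $\sigma''$ share state $\sigma_i$.

- Chop-star: $\mathcal{M}_\sigma[[P^*]] = true$ iff $\mathcal{M}_{\sigma_{l_i:l_{i+1}}}[[P]] = true$, for each $i$: $0 \le i < n$,
  for some $n \ge 0$ and finite sequence of natural numbers $l_0 \le l_1 \le \cdots \le l_n$ where
  $l_0 = 0$ and $l_n = |\sigma|$. Every one-state interval satisfies $P^*$ since we can trivially choose
  $n = 0$.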



If a formula $P$ is true on an interval $\sigma$, then $\sigma$ satisfies $P$, denoted $\sigma \models P$. A formula
$P$ satisfied by all intervals is valid, denoted $\models P$.

We view formulas as boolean terms to avoid, for example, distinct theorems for
quantified boolean and numerical variables. Hence $P \equiv Q$ and $P = Q$ are identical.
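The semantics above can be sketched as a small evaluator. This is only an illustration under assumptions that are not in the paper: formulas are encoded as nested Python tuples, an interval is a nonempty list of states, each state is a dict from variable names to values, and the names `sat`, `state`, `chop` and `star` are hypothetical. Quantifiers and terms are left out.

```python
# Formulas are nested tuples; an interval is a nonempty list of states (dicts).
# This encoding is an illustrative assumption, not notation from the paper.

def state(pred):
    """State formula: a predicate tested on the interval's initial state."""
    return ("state", pred)

TRUE = state(lambda s: True)
FALSE = state(lambda s: False)
SKIP = ("skip",)

def chop(p, q):
    return ("chop", p, q)

def star(p):
    return ("star", p)

def sat(f, sigma):
    """Return True iff the interval sigma satisfies the formula f."""
    kind = f[0]
    if kind == "state":
        return f[1](sigma[0])
    if kind == "not":
        return not sat(f[1], sigma)
    if kind == "and":
        return sat(f[1], sigma) and sat(f[2], sigma)
    if kind == "skip":
        return len(sigma) == 2          # two states, i.e. length 1
    if kind == "chop":
        # Try every shared state k with 0 <= k <= |sigma|.
        return any(sat(f[1], sigma[:k + 1]) and sat(f[2], sigma[k:])
                   for k in range(len(sigma)))
    if kind == "star":
        last = len(sigma) - 1
        # reach[j]: the prefix ending at state j splits into parts satisfying f[1].
        # Zero-length parts never advance, so only strictly longer parts are tried.
        reach = [True] + [False] * last
        for j in range(1, last + 1):
            reach[j] = any(reach[i] and sat(f[1], sigma[i:j + 1]) for i in range(j))
        return reach[last]
    raise ValueError(f"unknown construct: {kind}")
```

For instance, `sat(chop(SKIP, state(lambda s: s["I"] == 1)), [{"I": 2}, {"I": 1}, {"I": 2}])` checks the formula skip; $I = 1$ on a three-state interval, and `sat(star(FALSE), [{"I": 0}])` checks that $false^*$ holds on a one-state interval.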



3.3 Some Definable Constructs 



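Constructs like true, $P \vee Q$ and $\exists v.\,P$ are definable as are $\Diamond P$ (“sometimes $P$”), $\Box P$
(“always $P$”) and $\bigcirc P$ (“next $P$”):

$\Diamond P \stackrel{\rm def}{\equiv} true; P$        $\Box P \stackrel{\rm def}{\equiv} \neg\Diamond\neg P$        $\bigcirc P \stackrel{\rm def}{\equiv} skip; P$

We refer to the quantifier-free ITL subset built from no temporal operators but $\bigcirc$ and $\Diamond$
as simple temporal logic. Here are more operators expressible in this:

$\bigcirc_w P \stackrel{\rm def}{\equiv} \neg\bigcirc\neg P$          (Weak next)
$more \stackrel{\rm def}{\equiv} \bigcirc true$               (More states)
$empty \stackrel{\rm def}{\equiv} \neg more$                  (One state)
$fin\ P \stackrel{\rm def}{\equiv} \Box(empty \supset P)$     (Final state)
$halt\ P \stackrel{\rm def}{\equiv} \Box(P \equiv empty)$     (Just last)
$\Box_m P \stackrel{\rm def}{\equiv} \Box(more \supset P)$    (Mostly)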
The conventional temporal operator until, though definable (with $\exists$), is not needed. A
version of $\bigcirc$ for numerical terms is expressible using conditional terms. The formula
$t$ gets $t'$ is true iff for each pair of adjacent states, the value of term $t'$ at the first state
(i.e., on the suffix subinterval starting from it) equals term $t$'s value at the second state:

$t\ gets\ t' \stackrel{\rm def}{\equiv} \Box_m((\bigcirc t) = t')$.

Below are operators for examining initial and arbitrary subintervals:

$\Diamond_i P \stackrel{\rm def}{\equiv} P; true$    $\Box_i P \stackrel{\rm def}{\equiv} \neg\Diamond_i\neg P$    $\Diamond_a P \stackrel{\rm def}{\equiv} true; P; true$    $\Box_a P \stackrel{\rm def}{\equiv} \neg\Diamond_a\neg P$.



4 A Proof System 



We now present a proof system for ITL. Our experience with hundreds of proofs has 
helped refine it. There is a quantifier-free part and another for quantifiers. 



4.1 Quantifier-Free Axioms and Inference Rules. 

We use some of Rosner and Pnueli's axioms for chop [18] and ours for $\Box_i$ and chop-star
[12]. Let $w$ be a state formula, i.e., without temporal operators.



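Basic       $\vdash$ Substitution instances of all valid quantifier-free state formulas.
P2          $\vdash (P; Q); R \equiv P; (Q; R)$
P3          $\vdash (P \vee P'); Q \supset (P; Q) \vee (P'; Q)$
P4          $\vdash P; (Q \vee Q') \supset (P; Q) \vee (P; Q')$
P5          $\vdash empty; P \equiv P$
P6          $\vdash P; empty \equiv P$
P7          $\vdash w \supset \Box_i w$
MP          $\vdash P \supset Q,\ \vdash P \ \Rightarrow\ \vdash Q$
$\Box$Gen   $\vdash P \ \Rightarrow\ \vdash \Box P$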



P8 


h 


W D Ow , 




where variables in w are static. 


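P9            $\vdash \Box_i(P \supset P') \wedge \Box(Q \supset Q') \supset (P; Q) \supset (P'; Q')$
P10           $\vdash \bigcirc P \supset \bigcirc_w P$
P11           $\vdash P \wedge \Box(P \supset \bigcirc_w P) \supset \Box P$
P12           $\vdash P^* \equiv empty \vee (P \wedge more); P^*$
$\Box_i$Gen   $\vdash P \ \Rightarrow\ \vdash \Box_i P$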



These axioms and inference rules do not have quantifiers but $w$, $P$, etc. can. In Axiom
Basic, term $t$ substitutes into variable $v$ only if $domain(t) \subseteq domain(v)$. Axiom P11
enables induction over time.

A formula $P$ deduced from the axiom system is called an ITL theorem, denoted $\vdash P$.
Below are a few theorems. The full paper has more with some proofs.



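T1   $\vdash \Box(P \supset Q) \supset \Box P \supset \Box Q$
T2   $\vdash \bigcirc(P \supset Q) \supset \bigcirc P \supset \bigcirc Q$
T3   $\vdash \Diamond empty$
T4   $\vdash (w \wedge P); Q \equiv w \wedge (P; Q)$
T5   $\vdash$ [formula not recoverable from the source]
T6   $\vdash skip^*$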
4.2 Axioms and Inference Rules for Quantifiers

In the axioms and inference rules for quantifiers, $v$ is an arbitrary variable:

Q1   $\vdash \forall v.\,P \supset P_v^t$,
     where $v$ is free for $t$ in $P$ and $domain(t) \subseteq domain(v)$. (The full paper describes
     substitution into temporal contexts.)

Q2   $\vdash \forall v.\,(P \supset Q) \supset (P \supset \forall v.\,Q)$, where $v$ does not occur freely in $P$.

Q3   $\vdash \exists v.\,(P; Q) \equiv (\exists v.\,P); Q$, where $v$ does not occur freely in $Q$.

Q4   $\vdash \exists v.\,(P; Q) \equiv P; (\exists v.\,Q)$, where $v$ does not occur freely in $P$.

Q5   $\vdash (\exists v.\,P); \bigcirc(\exists v.\,Q) \supset \exists v.\,(P; \bigcirc Q)$, where $v$ is a state variable.

$\forall$Gen   $\vdash P \ \Rightarrow\ \vdash \forall v.\,P$, for any variable $v$.

The next theorem expresses chop-star using a fresh boolean variable $B$:

T7   $\vdash P^* \equiv \exists B.\,(B \wedge \Box_m(B \supset \Diamond_i(P \wedge \bigcirc halt\ B)))$.

Here is one to construct a hidden state variable $v$ always equaling $t$:

T8   $\vdash \exists v.\,\Box(v = t)$,

where $domain(t) \subseteq domain(v)$ and $v$ does not occur freely in $t$.

The one below creates a hidden state variable $v$ which is initialized and then incrementally
assigned a term which can depend on $v$'s current value:

T9   $\vdash \exists v.\,(v = t \wedge v\ gets\ t')$,

where $domain(t) \subseteq domain(v)$ and $domain(t') \subseteq domain(v)$. Also $v$ does
not occur freely in $t$ or within the temporal operators in $t'$.

Thus, the boolean variable $B$ below initially equals false and always flips:

$\vdash \exists B.\,(B = false \wedge B\ gets\ \neg B)$.

One can easily show that the axiom system is sound, that is, $\vdash P$ implies $\models P$. Our
main goal is conversely to establish completeness, that is, $\models P$ implies $\vdash P$:

Theorem 4.1 (Completeness) Any valid ITL formula is also a theorem. 



5 Overview of the Proof of Completeness 

The basic completeness proof assumes formulas contain no static variables: 

Lemma 5.1 (Relative completeness for static variables) If all valid formulas without 
static variables are theorems, so are those with them. 

Proof (Outline) Let $P$ be a valid formula and let $P'$ be $\forall u_1 \ldots \forall u_n.\,P$, where $u_1, \ldots, u_n$
are the free static variables in $P$. Now $P'$ is also valid and provably implies $P$ (i.e.,
$\vdash P' \supset P$). For each static variable $u$ in $P'$, replace any subformula $\forall u.\,Q$ using the
theorem $\vdash \forall u.\,Q \equiv \bigwedge_{c \in domain(u)} Q_u^c$. The new formula $P''$ is valid and provably equivalent
to $P'$ (i.e., $\vdash P' \equiv P''$). By our assumption, $P''$ is a theorem so we can deduce $P$. □
The proof of Theorem 4.1 reduces formulas to equivalent ones in a normal form:

Lemma 5.2 (Normal form) For each formula we can deduce an equivalent one having
no new variables and in which each equality is of the form $v = c$, where $v$ is numerical
and $c \in domain(v)$. If the original formula is in the simple temporal logic defined in §3.3,
so is the normalized one.

Lemma 5.3 (Completeness for simple temporal logic) Any valid formula in the simple
temporal logic subset is a theorem.

Proof (Outline) We start with a complete axiom system for conventional linear-time
temporal logic easily altered for finite intervals and finite domains. All of the axioms
and inference rules are provable from our ITL axiom system. They are Axioms Basic,
P8, P10 and P11, Inference Rules MP and $\Box$Gen and Theorems T1, T2 and T3.
Therefore, ITL theoremhood of any valid formula in the subset can be deduced. □



5.1 Automata 

We now adapt the approach of Kesten and Pnueli [7] and utilize a variant of finite-state 
automata called here chop-automata which selectively accept intervals. All normalized 
ITL formulas can be built from a few constructs, each having a translation into an 
automaton. In effect, we embed a decision procedure for ITL in ITL itself and use the 
logic to express the procedure’s correctness. This helps to show completeness. 

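The behavior of an automaton $\mathcal{A}$ can be expressed in ITL as the formula denoted $\mathcal{X}^{\mathcal{A}}$
(defined later) which is true on an interval iff $\mathcal{A}$ accepts that interval. We then construct
from a formula $P$ an automaton $\mathcal{A}^P$ accepting the intervals satisfying $P$. The formula $\mathcal{X}^{\mathcal{A}^P}$
represents $\mathcal{A}^P$'s accepting runs and is provably equivalent to $P$ in the axiom system
(i.e., $\vdash P \equiv \mathcal{X}^{\mathcal{A}^P}$). The shorter form $\mathcal{X}^P$ is generally used. We can also show for any
automaton $\mathcal{A}$ accepting no intervals, the formula $\mathcal{X}^{\mathcal{A}}$ is provably false (i.e., $\vdash \neg\mathcal{X}^{\mathcal{A}}$).

To prove that a valid formula $P$ is a theorem, we construct from $\neg P$ an automaton
$\mathcal{A}^{\neg P}$ with $\vdash \neg P \equiv \mathcal{X}^{\neg P}$. Now $P$ is valid, so $\mathcal{A}^{\neg P}$ accepts nothing and we deduce
$\vdash \neg\mathcal{X}^{\neg P}$. These together yield $\vdash P$.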

We now describe chop-automata for recognizing finite intervals. 

Definition 5.4 (Chop-automaton) A (nondeterministic) chop-automaton $\mathcal{A}$ is a quintuple
$(V, K, q_0, \delta, \tau)$ for which

- $V$ is a possibly empty finite set of boolean and numerical state variables,

- $K$ is a nonempty finite set of automaton states,

- $q_0 \in K$ is the initial state,

- $\delta$ is the transition function mapping $K \times K$ to quantifier-free state formulas over
  variables in $V$,

- $\tau$ is the termination function mapping $K$ to quantifier-free state formulas over
  variables in $V$.



It is necessary to introduce the notion of a run of a chop-automaton on an interval: 
Definition 5.5 (Run and accepting run of chop-automaton) A run of a chop-automaton
$\mathcal{A}$ over an interval $\sigma$ is any finite sequence $\rho$ of $|\sigma| + 1$ elements $\rho_0, \ldots, \rho_{|\sigma|} \in K$ in
which for each two adjacent automaton states $\rho_i$ and $\rho_{i+1}$ the interval state $\sigma_i$ satisfies
the transition formula $\delta(\rho_i, \rho_{i+1})$ (i.e., $\sigma_i \models \delta(\rho_i, \rho_{i+1})$).

A run $\rho$ is called an accepting run of the chop-automaton $\mathcal{A}$ over the interval $\sigma$ if
the run's initial state $\rho_0$ is $q_0$ and in addition $\sigma$'s final state $\sigma_{|\sigma|}$ satisfies the termination
condition selected by the run's final automaton state $\rho_{|\sigma|}$, namely $\tau(\rho_{|\sigma|})$.

We say that $\mathcal{A}$ accepts an interval $\sigma$ if there is at least one accepting run over $\sigma$.

In contrast to conventional finite-state automata, a chop-automaton uses $\tau$ to test the
very end of an interval without advancing to permit the operator chop to be represented.

An automaton $\mathcal{A}$'s accepting runs are expressible in ITL. Let $Y$ be a numerical state
variable not in $V$ and with $K \subseteq domain(Y)$. Define the formula $acc\_r^{\mathcal{A}}(Y)$ as follows:

$acc\_r^{\mathcal{A}}(Y) \stackrel{\rm def}{\equiv} Y = q_0 \wedge \Box_m\,\delta(Y, \bigcirc Y) \wedge fin\ \tau(Y)$.

The formula $\mathcal{X}^{\mathcal{A}}$ now defined expresses the existence of some accepting run:

$\mathcal{X}^{\mathcal{A}} \stackrel{\rm def}{\equiv} \exists Y.\,acc\_r^{\mathcal{A}}(Y)$.
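The notion of an accepting run can be checked mechanically by tracking the set of automaton states reachable along the interval. The sketch below is an illustration only and rests on assumptions that are not in the paper: an automaton is a tuple `(K, q0, delta, tau)` in which `delta(q, q2)` and `tau(q)` return Python predicates over a single interval state, the variable set $V$ is left implicit in those predicates, and the name `accepts` is hypothetical.

```python
TRUE = lambda s: True
FALSE = lambda s: False

def accepts(automaton, sigma):
    """Return True iff some accepting run exists over the nonempty state list sigma."""
    K, q0, delta, tau = automaton
    current = {q0}                       # automaton states a run may occupy so far
    for sigma_i in sigma[:-1]:           # a transition is taken at every state but the last
        current = {q2 for q in current for q2 in K if delta(q, q2)(sigma_i)}
    # tau tests the final interval state without advancing the automaton.
    return any(tau(q)(sigma[-1]) for q in current)

# A two-state automaton that accepts exactly the two-state intervals.
two_state = (
    {0, 1},
    0,
    lambda q, q2: TRUE if (q, q2) == (0, 1) else FALSE,
    lambda q: TRUE if q == 1 else FALSE,
)
```

With these definitions, `accepts(two_state, [{}, {}])` holds while one-state and three-state intervals are rejected, since no run can leave automaton state 1.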

5.2 Automata Constructions 

Given some normalized formula $P$ (see Lemma 5.2 presented earlier), we construct an
automaton $\mathcal{A}^P$. We sometimes denote the individual parts of $\mathcal{A}^P$ as $V^P$, $K^P$, etc. and
abbreviate $acc\_r^{\mathcal{A}^P}(Y)$ and $\mathcal{X}^{\mathcal{A}^P}$ as $acc\_r^P(Y)$ and $\mathcal{X}^P$, respectively.

Here is a list of formulas which we need to consider: $w$ (quantifier-free state formula),
$P \vee Q$, $\neg P$, $\exists v.\,P$, skip, and $P; Q$. These constructs are ones most readily translated
to automata. We replace chop-star formulas using ITL Theorem T7 in §4.2 to avoid the
need to also directly reduce them to automata.

During the construction of automata, various operations can be performed such as
renaming of an automaton's states or determinizing it. These are expressible as ITL
theorems. The details are omitted here. We also have the following lemma:

Lemma 5.6 If $\mathcal{A}$ has no accepting runs, then $\vdash \neg\mathcal{X}^{\mathcal{A}}$.

Proof Suppose that $\mathcal{A}$ has no accepting runs. The following formula is valid and hence a
theorem by Lemma 5.3: $\vdash \neg acc\_r^{\mathcal{A}}(Y)$. We introduce an existential quantifier to deduce
the theorem $\vdash \neg\exists Y.\,acc\_r^{\mathcal{A}}(Y)$ which reduces to $\vdash \neg\mathcal{X}^{\mathcal{A}}$. □

Below are constructions for w, skip and chop. The full paper also looks at others. 



Automata for Quantifier-Free State Formulas For a quantifier-free state formula $w$,
the automaton $\mathcal{A}^w$ has $V$ equal the set of $w$'s variables, $K = \{0, 1\}$, $q_0 = 0$ with $\delta$ and $\tau$
as follows:

$\delta(0,0)$: false    $\delta(0,1)$: $w$       $\tau(0)$: $w$    $\tau(1)$: true
$\delta(1,0)$: false    $\delta(1,1)$: true
Claim 5.7 The automaton $\mathcal{A}^w$ accepts an interval $\sigma$ iff $\sigma$ satisfies $w$.

Lemma 5.8 The following equivalence is an ITL theorem: $\vdash w \equiv \mathcal{X}^w$.

Proof Let $X$ be a numerical state variable with domain $\{0, 1\}$ and not occurring in $w$.
The following valid simple temporal formula is a theorem by Lemma 5.3:

$\vdash w \wedge X = 0 \wedge X\ gets\ 1 \supset acc\_r^w(X)$.    (5.1)

The variable $X$ can now be existentially created using ITL Theorem T9:

$\vdash \exists X.\,(X = 0 \wedge X\ gets\ 1)$.    (5.2)

The two theorems (5.1) and (5.2) are combined to obtain the following:

$\vdash w \supset \exists X.\,acc\_r^w(X)$.    (5.3)

For the converse, the valid formula $acc\_r^w(X) \supset w$ is a theorem by Lemma 5.3. We
then deduce $\vdash \exists X.\,acc\_r^w(X) \supset w$ which with (5.3) yields $\vdash w \equiv \mathcal{X}^w$. □

Automaton for skip Below is an automaton accepting two-state intervals:

$V = \emptyset$,   $K = \{0, 1\}$,   $q_0 = 0$,

$\delta(0,0)$: false    $\delta(0,1)$: true     $\tau(0)$: false    $\tau(1)$: true
$\delta(1,0)$: false    $\delta(1,1)$: false

Claim 5.9 The automaton $\mathcal{A}^{skip}$ accepts an interval $\sigma$ iff $\sigma$ satisfies skip.

Lemma 5.10 The following equivalence is provably true: $\vdash skip \equiv \mathcal{X}^{skip}$.

Proof We first look at deducing $skip \supset \mathcal{X}^{skip}$. Let $X$ have domain $\{0, 1\}$. The next valid
formula is a theorem by Lemma 5.3:

$\vdash skip \wedge \Box(X = \mathit{if}\ more\ \mathit{then}\ 0\ \mathit{else}\ 1) \supset acc\_r^{skip}(X)$.    (5.4)

A hidden instance of $X$ is now created with ITL Theorem T8 in §4.2:

$\vdash \exists X.\,\Box(X = \mathit{if}\ more\ \mathit{then}\ 0\ \mathit{else}\ 1)$.    (5.5)

We then combine the two theorems (5.4) and (5.5):

$\vdash skip \supset \exists X.\,acc\_r^{skip}(X)$.    (5.6)

For the converse, the valid formula below is a theorem by Lemma 5.3:

$\vdash acc\_r^{skip}(X) \supset skip$.

The variable $X$ is existentially hidden to deduce the following:

$\vdash \exists X.\,acc\_r^{skip}(X) \supset skip$.    (5.7)

We reach the goal by combining (5.6) and (5.7): $\vdash skip \equiv \mathcal{X}^{skip}$. □
Automata for chop Let us now construct an automaton for the formula $P; Q$ and
deduce the equivalence of $P; Q$ and $\mathcal{X}^{P;Q}$. Assume by induction that $\mathcal{A}^P$ and $\mathcal{A}^Q$ are $P$'s
and $Q$'s respective automata with disjoint $K^P$ and $K^Q$. Here is a suitable $\mathcal{A}^{P;Q}$:

$V = V^P \cup V^Q$,   $K = K^P \cup K^Q$,   $q_0 = q_0^P$,

$\delta(q, q')$:
    $\delta^P(q, q')$, for $q, q' \in K^P$
    $\delta^Q(q, q')$, for $q, q' \in K^Q$
    $\tau^P(q) \wedge \delta^Q(q_0^Q, q')$, for $q \in K^P$, $q' \in K^Q$
    false, otherwise

$\tau(q)$:
    $\tau^P(q) \wedge \tau^Q(q_0^Q)$, for $q \in K^P$
    $\tau^Q(q)$, for $q \in K^Q$
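The construction for chop can be sketched in code and exercised on small intervals. This is a hedged illustration, not the paper's formal development: automata are tuples `(K, q0, delta, tau)` whose `delta` and `tau` return Python predicates over one interval state, automaton states are tagged with `"P"` or `"Q"` to keep the two state sets disjoint, and the names `accepts`, `chop_automaton`, `state_automaton` and `SKIP_AUT` are all hypothetical.

```python
TRUE = lambda s: True
FALSE = lambda s: False

def accepts(automaton, sigma):
    """True iff some accepting run exists over the nonempty state list sigma."""
    K, q0, delta, tau = automaton
    current = {q0}
    for sigma_i in sigma[:-1]:
        current = {q2 for q in current for q2 in K if delta(q, q2)(sigma_i)}
    return any(tau(q)(sigma[-1]) for q in current)

def conj(f, g):
    return lambda s: f(s) and g(s)

def chop_automaton(aP, aQ):
    """Combine automata for P and Q into one for P; Q."""
    KP, p0, dP, tP = aP
    KQ, q0, dQ, tQ = aQ
    K = {("P", q) for q in KP} | {("Q", q) for q in KQ}

    def delta(a, b):
        (side_a, q), (side_b, q2) = a, b
        if side_a == "P" and side_b == "P":
            return dP(q, q2)
        if side_a == "Q" and side_b == "Q":
            return dQ(q, q2)
        if side_a == "P" and side_b == "Q":
            # Finish P at the shared state and take Q's first transition there.
            return conj(tP(q), dQ(q0, q2))
        return FALSE

    def tau(a):
        side, q = a
        # If the run ends inside P's part, Q must accept the one-state suffix.
        return conj(tP(q), tQ(q0)) if side == "P" else tQ(q)

    return (K, ("P", p0), delta, tau)

def state_automaton(w):
    """Automaton for a state formula w, given as a predicate on one state."""
    delta = lambda q, q2: {(0, 1): w, (1, 1): TRUE}.get((q, q2), FALSE)
    tau = lambda q: w if q == 0 else TRUE
    return ({0, 1}, 0, delta, tau)

SKIP_AUT = ({0, 1}, 0,
            lambda q, q2: TRUE if (q, q2) == (0, 1) else FALSE,
            lambda q: TRUE if q == 1 else FALSE)
```

For example, `chop_automaton(SKIP_AUT, state_automaton(lambda s: s["I"] == 1))` plays the role of an automaton for skip; $I = 1$ and can be run with `accepts` on intervals such as `[{"I": 2}, {"I": 1}, {"I": 5}]`.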

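Claim 5.11 The automaton $\mathcal{A}^{P;Q}$ accepts an interval $\sigma$ iff $\sigma$ satisfies $P; Q$.

Lemma 5.12 Formulas $P; Q$ and $\mathcal{X}^{P;Q}$ are provably equivalent: $\vdash P; Q \equiv \mathcal{X}^{P;Q}$.

Proof We assume by induction $\vdash P \equiv \mathcal{X}^P$ and $\vdash Q \equiv \mathcal{X}^Q$ and then deduce the following:

$\vdash P; Q \equiv \mathcal{X}^P; \mathcal{X}^Q$    (5.8)

To show that the valid formula $\mathcal{X}^P; \mathcal{X}^Q \equiv \mathcal{X}^{P;Q}$ is a theorem, we re-express it:

$(\exists X.\,acc\_r^P(X)); (\exists Y.\,acc\_r^Q(Y)) \equiv \exists Z.\,acc\_r^{P;Q}(Z)$.

Here $X$, $Y$ and $Z$ share a domain which is a superset of $K^P$, $K^Q$ and $K$. The left
subformula's quantifiers can be moved out of the chop operator:

$\vdash (\exists X.\,acc\_r^P(X)); (\exists Y.\,acc\_r^Q(Y)) \equiv \exists X, Y.\,(acc\_r^P(X); acc\_r^Q(Y))$.    (5.9)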

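We now turn to the formula $acc\_r^P(X); acc\_r^Q(Y)$. This is provably equivalent to the
formula $\exists B.\,\phi(B, X, Y)$, where $B$ is a new boolean state variable and the subformula
$\phi(B, X, Y)$ is as now defined:

$\phi(B, X, Y) \stackrel{\rm def}{\equiv} X = q_0^P \wedge B$
    $\wedge\ \Box_m(B \supset \delta^P(X, \bigcirc X))$
    $\wedge\ \Box_m(\neg B \supset \delta^Q(Y, \bigcirc Y) \wedge \bigcirc\neg B)$
    $\wedge\ \Box(B \wedge \bigcirc_w\neg B \supset \tau^P(X) \wedge Y = q_0^Q \wedge (empty \vee \delta^Q(Y, \bigcirc Y)))$
    $\wedge\ fin\ \tau^Q(Y)$.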

We represent the automata's behavior using $\phi$ because it is in simple temporal logic.
The purpose of $B$ is to indicate at each interval state which of the chop construct's
two subintervals contains the state. When $B$ is true, the state is enclosed in the left
subinterval and automaton $\mathcal{A}^P$ is active. When $B$ is false the state is within the right one
and $\mathcal{A}^Q$ is active. In the case of the single state shared by both intervals, $B$ remains true
as in the left subinterval. If the right subinterval has only one state then $B$ is always true.
The relationship between $\phi$ and $acc\_r^P(X); acc\_r^Q(Y)$ is now expressed:

$\vdash \exists B.\,\phi(B, X, Y) \equiv acc\_r^P(X); acc\_r^Q(Y)$.    (5.10)
In order to prove this, we first deduce the next theorem:

$\vdash \phi(B, X, Y) \equiv (acc\_r^P(X) \wedge \Box B); (acc\_r^Q(Y) \wedge B \wedge \bigcirc_w\Box\neg B)$.

Its proof uses temporal fixpoints and derived inference rules for compositionality. The
details are omitted here. Lemma 5.3 yields the theoremhood of some valid formulas
involving $\phi$ such as the next one for creating $Z$ from $B$, $X$ and $Y$:

$\vdash \phi(B, X, Y) \wedge \Box(Z = \mathit{if}\ B\ \mathit{then}\ X\ \mathit{else}\ Y) \supset acc\_r^{P;Q}(Z)$.

We combine this with ITL Theorem T8 in §4.2 and hide $B$ and $Z$:

$\vdash \exists B.\,\phi(B, X, Y) \supset \exists Z.\,acc\_r^{P;Q}(Z)$.

From this and ITL Theorems (5.9) and (5.10) we achieve half of the goal:

$\vdash (\exists X.\,acc\_r^P(X)); (\exists Y.\,acc\_r^Q(Y)) \supset \exists Z.\,acc\_r^{P;Q}(Z)$.    (5.11)

For the converse of (5.11), we first deduce the next valid formula using Lemma 5.3:

$\vdash acc\_r^{P;Q}(Z) \wedge \Box(B \equiv (Z \in K^P)) \wedge \Box(X = Z)$
    $\wedge\ Y = q_0^Q \wedge Y\ gets\ (\mathit{if}\ (\bigcirc B)\ \mathit{then}\ Y\ \mathit{else}\ \bigcirc Z) \supset \phi(B, X, Y)$.

We then existentially hide $B$, $X$ and $Y$ using ITL Theorems T8 and T9:

$\vdash acc\_r^{P;Q}(Z) \supset \exists B, X, Y.\,\phi(B, X, Y)$.

This, (5.9) and (5.10) lead to our desired equivalence's other direction:

$\vdash \exists Z.\,acc\_r^{P;Q}(Z) \supset (\exists X.\,acc\_r^P(X)); (\exists Y.\,acc\_r^Q(Y))$.    (5.12)

From ITL Theorems (5.11) and (5.12) and $\mathcal{X}$'s definition, we get $\vdash \mathcal{X}^P; \mathcal{X}^Q \equiv \mathcal{X}^{P;Q}$. This
and theorem (5.8) yield our main goal: $\vdash P; Q \equiv \mathcal{X}^{P;Q}$. □

6 Discussion 

The version of ITL here uses numerical variables. Alternatively, we can restrict all vari- 
ables to being boolean and encode numbers as Kesten and Pnueli do. 

In [13] we look at a compositional axiom system for temporal projection over finite
intervals which is claimed to be complete. We would also like support for $\omega$-intervals.

Hale [5] first studied framing in ITL. At present, if a state variable does not change
value, this must be explicitly specified. Framing makes this implicit and shortens
specifications. A complete axiom system for framing would be helpful.
Abstract. In 1991 Feige, Goldwasser, Lovász, Safra and Szegedy found
a connection between the approximability of Max-Clique and the area of
multiprover interactive proofs. What first seemed like an isolated result
has developed into an entire area of research.

The connection between interactive proofs and in particular the variant 
called probabilistically checkable proofs or PCPs and inapproximability 
results for many NP-hard optimization problems has proved to be funda- 
mental. For some optimization problems, like clique and many constraint 
satisfaction problems, the parameter giving the degree of inapproxima- 
bility corresponds to a natural parameter in the PCP. Other times, like 
for colorability and the traveling salesman problem the correspondence 
is not as direct, but the best results are still derived from PCPs. 

The basic qualitative result on the existence of efficient PCPs was given 
in the famous paper by Arora, Lund, Motwani, Sudan and Szegedy in 
1992 but since then the quantitative results have improved considerably. 
For many problems we now have almost tight results while for some 
others our knowledge is less complete. 

The goal of this talk is to give an understanding of what has happened, 
outline a few of the results and give at least a taste of some ingredients 
of the proofs involved. 
Abstract. We show that satisfiability of formulas in $k$-CNF can be decided
deterministically in time close to $(2k/(k+1))^n$, where $n$ is the
number of variables in the input formula. This is the best known worst-case
upper bound for deterministic $k$-SAT algorithms. Our algorithm
can be viewed as a derandomized version of Schöning's probabilistic algorithm
presented in [15]. The key point of our algorithm is the use of
covering codes together with local search. Compared to other “weakly
exponential” algorithms, our algorithm is technically quite simple.

We also show how to improve the bound above by moderate technical
effort. For 3-SAT the improved bound is $1.481^n$.



1 Introduction 

Worst-Case Upper Bounds for SAT. The satisfiability problem for propositional
formulas (SAT) can be decided by an obvious algorithm in time $2^n$,¹ where $n$
is the number of variables in the input formula. This worst-case upper bound
can be decreased for $k$-SAT, i.e., if we restrict inputs to formulas in conjunctive
normal form with at most $k$ literals per clause ($k$-CNF). First upper bounds $c^n$,
where $c < 2$, were obtained in the 1980s [4, 11], for example, the bound $1.619^n$ for
3-SAT. Currently, much research in SAT algorithms is aimed at decreasing the
base in exponential upper bounds, e.g., |l6ll4l8ll3lbll2lVIT51 . 

The best known bound for probabilistic 3-SAT algorithms is $(4/3)^n$ due to
U. Schöning [15]. For probabilistic $k$-SAT algorithms when $k \ge 4$, the best known
bound is due to R. Paturi, P. Pudlák, M. E. Saks, and F. Zane. This bound
is not represented in compact form; the bound for 4-SAT is $1.477^n$, the bound
for 5-SAT is $1.569^n$.

* Work done in part while visiting Department of Computer Science of University of
Manchester. Supported by grants from TFR, INTAS, and RFBR.

** Supported by INTAS project 96-0760 and RFBR grant 99-01-00113.

¹ In this paper, we write all complexity bounds up to a polynomial in the size of the
input formula. For example, we write $2^n$ instead of $poly(n)\,2^n$.

The best known bounds for deterministic $k$-SAT algorithms are as follows.
For $k = 3$, O. Kullmann [7] gives the bound $1.505^n$. In [?] the bound $1.497^n$
was announced ([?] sketched how this bound can be obtained by a refinement
of [?]). For $k = 4$, the best known bound is still $1.840^n$ due to B. Monien and
E. Speckenmeyer [11]. For $k \ge 5$, R. Paturi, P. Pudlák and F. Zane give the
bound approaching $2^{(1 - 1/(2k))n}$ for large $k$. In this paper we improve all these
bounds for deterministic algorithms.

Probabilistic Algorithm. Our algorithm is inspired by the currently fastest probabilistic
3-SAT algorithm of [15], which uses a well-known heuristic search method,
cf. [?]. This algorithm is based on random walks (like Papadimitriou's algorithm
[?] for 2-SAT). Given a formula $F$ in $k$-CNF with $n$ variables, the algorithm
chooses exponentially many initial assignments at random and runs local
search for each of them. Namely, if the assignment does not satisfy $F$, then the
algorithm chooses any unsatisfied clause, chooses a literal from this clause at
random, and flips its value. If a satisfying assignment is not found in $3n$ such
steps, the algorithm starts local search from another random initial assignment.

Thus, the probabilistic algorithm of [15] includes two randomized components:
(1) the choice of initial assignments, and (2) local search starting from
these assignments.

Deterministic Algorithm. The deterministic $k$-SAT algorithm presented in this
paper can be viewed as a derandomized version of the probabilistic algorithm
above. The derandomization consists of two parts. To derandomize the choice of
initial assignments, we cover the space of all possible $2^n$ assignments by balls of
some Hamming radius $r$. We use coding theory to find a good covering for given
$r$. In each ball, we run a deterministic version of local search to check whether
there is a satisfying assignment inside the ball.

The optimal value of $r$ can be chosen so that the overall running time is
minimal. Taking $r = n/(k+1)$, we obtain the running time $(2k/(k+1) + \varepsilon)^n$.
We also show how to decrease this bound by using a more complicated version
of local search. For 3-SAT, the modified algorithm gives the bound $1.481^n$.

Notation. We consider propositional formulas in $k$-CNF ($k \ge 3$ is a constant).
These formulas are conjunctions of clauses of size at most $k$. A clause is a
disjunction of literals. A literal is a propositional variable or its negation. The size
of a clause is the number of its literals. An assignment maps the propositional
variables to the truth values 0, 1, where 0 denotes false and 1 denotes true.
A trivial formula is the empty formula (which is always true) or a formula
containing the empty clause (which is always false).

For an assignment $a$ and a literal $l$, we write $a|_{l=1}$ to denote the assignment
obtained from $a$ by setting the value of $l$ to 1 (more precisely, we set the value
of the variable corresponding to $l$). We also write $F|_{l=1}$ to denote the formula
obtained from $F$ by assigning the value 1 to $l$, i.e., the clauses containing the
literal $l$ itself are deleted from $F$, and the complementary literal $\bar{l}$ is deleted
from the other clauses.

We identify assignments with binary words. The Hamming distance between 
two assignments is the number of positions in which these two assignments differ. 
The ball of radius r around an assignment a is the set of all assignments whose 
Hamming distance to a is at most r. 

A code of length $n$ is a subset of $\{0,1\}^n$. The covering radius $r$ of a code $C$
is defined by

$$r = \max_{u \in \{0,1\}^n} \min_{v \in C} d(u, v),$$

where $d(u, v)$ denotes the Hamming distance between $u$ and $v$. The normalized
covering radius $\rho$ is defined by $\rho = r/n$.

Organization of the Paper. Section 2 describes the local search procedure. In
Sect. 3 we specify the codes used in our algorithms. Section 4 contains the main
algorithm and its analysis. We describe the technique yielding the bound
$1.481^n$ in Sect. 5. We discuss further developments in Sect. 6.

2 Local Search 

Suppose we are given an initial assignment $a \in \{0,1\}^n$ where $n$ is the number of
variables. Consider the ball of radius $r$ around $a$. Obviously, the volume of this
ball is

$$V(n, r) = \sum_{i=0}^{r} \binom{n}{i}.$$

If the normalized radius $\rho = r/n$ satisfies $0 < \rho < 1/2$, the volume $V(n, r)$ can
be estimated as follows ([?, page 121] or [?, Lemma 2.4.4, page 33]):

$$\frac{1}{\sqrt{8n\rho(1-\rho)}} \cdot 2^{h(\rho)n} \le V(n, r) \le 2^{h(\rho)n}, \qquad (1)$$

where $h(\rho) = -\rho\log_2\rho - (1-\rho)\log_2(1-\rho)$ is the binary entropy function. These
bounds show that for a constant $\rho$, the volume $V(n, r)$ differs from $2^{h(\rho)n}$ at
most by a polynomial factor.
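The entropy estimate for the volume of a Hamming ball can be checked numerically for small parameters. The following is a minimal sketch; the function names `volume`, `entropy` and `volume_bounds` are illustrative choices, not names from the paper.

```python
from math import comb, log2, sqrt

def volume(n, r):
    """Number of n-bit words within Hamming distance r of a fixed word."""
    return sum(comb(n, i) for i in range(r + 1))

def entropy(p):
    """Binary entropy function h(p) for 0 < p < 1."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

def volume_bounds(n, r):
    """Lower and upper estimates of volume(n, r), valid for 0 < r/n < 1/2."""
    p = r / n
    upper = 2 ** (entropy(p) * n)
    lower = upper / sqrt(8 * n * p * (1 - p))
    return lower, upper
```

For example, `volume(20, 5)` can be compared with the pair returned by `volume_bounds(20, 5)`; the two estimates differ only by the factor $\sqrt{8n\rho(1-\rho)}$, which is polynomial in $n$.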

The efficiency of our algorithm relies on the following important observation. 
Suppose we need to find a satisfying assignment inside the ball of radius r around 
a (if one exists). Then it is not necessary to search through all assignments inside 
this ball. The given formula F can be used to prune the search tree. The following 
easy lemma captures this observation. 

Lemma 1. Let $F$ be a formula and $a$ be an assignment such that $F$ is false
under $a$. Let $C$ be an arbitrary clause in $F$ that is false under $a$. Then $F$ has
a satisfying assignment belonging to the ball of radius $r$ around $a$ iff there is a
literal $l$ in $C$ such that $F|_{l=1}$ has a satisfying assignment belonging to the ball of
radius $r - 1$ around $a$.

Proof. For any clause $C$ false under $a$, we have the following equivalence. The
formula $F$ has a satisfying assignment inside the ball of radius $r$ around $a$ iff
there are a literal $l$ in $C$ and a satisfying assignment $b$ such that $l$ has the value 1 in $b$
and the Hamming distance between $b$ and $a|_{l=1}$ is at most $r - 1$. The latter holds
iff there is a literal $l$ in $C$ such that $F|_{l=1}$ has a satisfying assignment inside the
ball of radius $r - 1$ around $a$. □

The recursive procedure Search($F$, $a$, $r$) described below is the local search
component of our algorithm. The procedure takes a formula $F$, an initial assignment
$a$, and a radius $r$ as input. It returns true if $F$ has a satisfying assignment
inside the ball of radius $r$ around $a$. Otherwise, the procedure returns false.

Procedure Search($F$, $a$, $r$)

1. If all clauses of $F$ are true under $a$ then return true.

2. If $r \le 0$ then return false.

3. If $F$ contains the empty clause then return false.

4. Pick (according to some deterministic rule) a clause $C$ false under $a$. Branch
on this clause $C$, i.e., for each literal $l$ in $C$ do the following: if
Search($F|_{l=1}$, $a$, $r - 1$) returns true then return true. If none of these calls
returns true, return false.

The correctness of the procedure easily follows by induction on $r$ with the
invariant: Search($F$, $a$, $r$) outputs true iff $F$ has a satisfying assignment inside
the ball of radius $r$ around $a$. For the induction step the lemma above is used.

The recursion depth of Search($F$, $a$, $r$) is at most $r$. If the input formula $F$
is in $k$-CNF, the number of recursive calls generated at Step 4 is bounded by
$k$. Therefore the recursion tree has at most $k^r$ leaves and the complexity of the
procedure is bounded by $k^r$.

Observe here that $k^r$ can be much smaller than the volume of the ball of
radius $r$ around $a$. For example, for $r = n/2$ and $k = 3$ we obtain $V(n, r) \ge 2^{n-1}$
whereas $3^r < 1.733^n$. This gives us a very simple deterministic SAT algorithm:
Given an input formula $F$ with $n$ variables, run Search($F$, $a_0$, $n/2$) and
Search($F$, $a_1$, $n/2$), where $a_0$ and $a_1$ are $n$-bit assignments $a_0 = (0, 0, \ldots, 0)$ and
$a_1 = (1, 1, \ldots, 1)$. Since any assignment has the Hamming distance $\le n/2$ to
either $a_0$ or $a_1$, the algorithm is correct. Its running time is bounded by $1.733^n$.
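The local search procedure and the two-ball algorithm described above can be sketched as follows. The encoding is an assumption made for illustration: a formula is a list of clauses, each clause a `frozenset` of nonzero integers (a positive integer $v$ for a variable, $-v$ for its negation), and an assignment is a dict from variable index to 0 or 1. The deterministic rule used to pick a clause (the first false clause in list order) and the names `search`, `restrict` and `simple_sat` are likewise hypothetical choices.

```python
def literal_true(lit, a):
    """True iff literal lit evaluates to 1 under assignment a."""
    return a[abs(lit)] == (1 if lit > 0 else 0)

def restrict(F, lit):
    """F with lit set to 1: drop clauses containing lit, delete -lit elsewhere."""
    return [c - {-lit} for c in F if lit not in c]

def search(F, a, r):
    """True iff F has a satisfying assignment within Hamming distance r of a."""
    false_clauses = [c for c in F if not any(literal_true(l, a) for l in c)]
    if not false_clauses:                 # step 1: a satisfies every clause
        return True
    if r <= 0:                            # step 2: no flips left
        return False
    if any(len(c) == 0 for c in F):       # step 3: empty clause, unsatisfiable
        return False
    clause = false_clauses[0]             # step 4: deterministic choice of a false clause
    return any(search(restrict(F, l), a, r - 1) for l in sorted(clause))

def simple_sat(F, n):
    """Search the two balls of radius n // 2 around all-zeros and all-ones."""
    zeros = {v: 0 for v in range(1, n + 1)}
    ones = {v: 1 for v in range(1, n + 1)}
    return search(F, zeros, n // 2) or search(F, ones, n // 2)
```

Radius `n // 2` suffices because an assignment with $d$ ones is at distance $d$ from all-zeros and $n - d$ from all-ones, and the smaller of the two is at most $\lfloor n/2 \rfloor$.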

The set $\{a_0, a_1\}$ of assignments in the example above is nothing else than a
very special covering code. In the next section we show how this example can be
generalized and improved.

3 Good Coverings 

By a covering of radius $r$ for $\{0,1\}^n$ we mean a code $C$ of length $n$ such that any
word in $\{0,1\}^n$ belongs to at least one ball of radius $r$ around a code word of $C$.

For any covering $C$ of radius $r$, we have $|C| \cdot V(n, r) \ge 2^n$, where $V(n, r)$ is
the volume of a ball of radius $r$. Using the upper bound on $V(n, r)$ in (1), we
obtain

$$|C| \ge \frac{2^n}{V(n, r)} \ge \frac{2^n}{2^{h(\rho)n}} = 2^{(1-h(\rho))n},$$

where $\rho = r/n$ is the normalized covering radius. Here the second inequality
holds when $0 < \rho < 1/2$. This lower bound on $|C|$ is known as the sphere
covering bound. The following lemma from coding theory implies the existence
of coverings whose cardinalities “almost” achieve the sphere covering bound.

Lemma 2 ([?, Theorem 12.1.2, page 320]). For any $n$ and $r$, there exists
a code $C$ of length $n$ and covering radius $r$ with

$$|C| \le \lceil n 2^n \ln 2 / V(n, r) \rceil. \qquad (2)$$



Corollary 1. Let $\delta > 0$ and $0 < \rho < 1/2$. There exist an integer $\nu = \nu(\delta)$ and
a code $\mathcal{F}$ of length $\nu$ with the following properties:

1. The code $\mathcal{F}$ has the covering radius $r \le \rho\nu$.

2. The code $\mathcal{F}$ achieves the sphere covering bound up to the “$+\delta\nu$” in the exponent,
namely: $|\mathcal{F}| \le 2^{(1-h(\rho)+\delta)\nu}$.

Proof. We obtain the required bound on the cardinality by combining (2) and
the lower bound on the volume in (1). □

The lemma and corollary above do not provide us with an efficient construc- 
tion of good coverings. To construct them efficiently, we use the concept of the 
direct sum of two codes. Let Ci and C 2 be two codes of lengths ni,U 2 and cover- 
ing radii ri , r 2 respectively. The direct sum of C\ and C 2 is the code C of length 
ni-\-n 2 that consists of all \Ci | • IC 2 I possible concatenations W\W 2 , where w\ € Ci 
and W 2 € C 2 . Obviously, the covering radius of C is ri -h r 2 . 

To generate a good covering $C$ for $\{0,1\}^n$, we fix $\varepsilon > \delta > 0$ and $0 < p < 1/2$.
Let $\mathcal{F}$ be a code of length $v$ satisfying Corollary 1. The following algorithm takes
a (large) length $n$ as input and uses $\mathcal{F}$ to generate the code $C$ of length $n$:

Algorithm for Generating Good Coverings 

1. Find the smallest integer $q$ such that $n \le qv$.

2. Generate the direct sum $\mathcal{F}^q$, i.e., generate words $w_1 w_2 \ldots w_q$ one by one,
where each $w_i$ ranges over $\mathcal{F}$. If $n$ is not divisible by $v$, we restrict the code
words of $\mathcal{F}^q$ to the first $n$ positions.

The resulting code is the required covering $C$. The algorithm runs in time
polynomial in $|C|$. If $n$ (and $q$) is sufficiently large, its covering radius is at most
$qvp \le \lfloor n(p+\varepsilon) \rfloor$ and the cardinality $|C|$ is bounded by:

$$|C| \le 2^{(1-h(p)+\delta)qv} \le 2^{(1-h(p)+\varepsilon)n}.$$

Note that the sphere covering bound is “almost” (up to the “$+\varepsilon n$” in the covering
radius and in the exponent) achieved. Note also that the code $\mathcal{F}$ provided by
Corollary 1 is “hard wired” into the algorithm.
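The generation step can be sketched as a lazy generator. This is a hedged illustration rather than the authors' implementation: the base code below is a tiny stand-in for the code of Corollary 1 (which the text treats as hard wired), and `generate_covering` is a hypothetical name.

```python
from itertools import product
from math import ceil

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def generate_covering(base_code, n):
    """Yield the words of the q-fold direct sum of base_code, cut to n positions."""
    v = len(base_code[0])
    q = ceil(n / v)  # smallest q with n <= q * v
    for blocks in product(base_code, repeat=q):
        word = tuple(bit for block in blocks for bit in block)
        yield word[:n]  # restriction to the first n positions

base_code = [(0, 0, 0), (1, 1, 1)]   # assumption: toy code of length 3, radius 1
covering = list(generate_covering(base_code, 7))
radius = max(min(hamming(w, c) for c in covering)
             for w in product((0, 1), repeat=7))
print(len(covering), radius)
```

Words are produced one by one, so the caller can run a search from each code word without storing the whole covering.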
4 Main Algorithm and Its Analysis

Given fixed $\varepsilon > 0$ and $0 < p < 1/2$, our satisfiability algorithm takes as input a
formula $F$ with $n$ variables and works as follows.



Main Algorithm

1. Generate a covering $C$ for $\{0,1\}^n$ by using the algorithm for generating good
coverings in Sect. 3.

2. Calculate $r = \lfloor n(p+\varepsilon) \rfloor$.

3. For each code word $a$ in $C$ run the procedure Search($F$, $a$, $r$). Return true if
at least one procedure call returns true. Otherwise return false.
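The overall structure can be sketched as follows. Several choices here are assumptions made for illustration: clauses are lists of non-zero integers (a negative integer denotes a negated variable), the local search flips a variable of the assignment instead of simplifying the formula, and the covering used in the example is the trivial two-word code consisting of the all-zeros and all-ones words with radius $\lfloor n/2 \rfloor$.

```python
def search(clauses, a, r):
    """True iff some satisfying assignment lies within Hamming distance r of a."""
    false_clause = next(
        (c for c in clauses if not any(a[abs(l)] == (l > 0) for l in c)), None)
    if false_clause is None:
        return True            # a already satisfies every clause
    if r <= 0:
        return False
    for lit in false_clause:   # branch on each literal of the false clause
        b = dict(a)
        b[abs(lit)] = lit > 0  # make this literal true
        if search(clauses, b, r - 1):
            return True
    return False               # also reached for an empty clause

def main_algorithm(clauses, covering, r):
    """Run the local search from every code word of the covering."""
    return any(
        search(clauses, {i: bool(bit) for i, bit in enumerate(word, start=1)}, r)
        for word in covering)

n = 3
covering = [(0,) * n, (1,) * n]          # balls of radius n // 2 cover {0,1}^n
formula = [[1, 2], [-1, 3], [-2, -3]]    # satisfiable 2-CNF example
print(main_algorithm(formula, covering, n // 2))
print(main_algorithm([[1], [-1]], [(0,), (1,)], 0))  # x and not-x: unsatisfiable
```

Each call to `search` explores at most $k^r$ branches for a $k$-CNF formula, which is the quantity the analysis below multiplies by the size of the covering.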

If inputs are formulas in $k$-CNF, the complexity of the algorithm is bounded
by $|C| \cdot k^r$. Since $|C| \le 2^{(1-h(p))n}$ and $k^r \le k^{pn}$ up to a factor of $c^{\varepsilon n}$, the
complexity is bounded by

$$2^{(1-h(p))n} \cdot k^{pn} = \left( 2\, p^p (1-p)^{1-p}\, k^p \right)^n.$$

The choice $p = 1/(k+1)$ minimizes (calculus required) the base value of this
exponential function. Substituting $1/(k+1)$ for $p$, we have

$$2 \left( \frac{1}{k+1} \right)^{1/(k+1)} \left( 1 - \frac{1}{k+1} \right)^{1-1/(k+1)} k^{1/(k+1)} = \frac{2k}{k+1}.$$



Thus, we obtain the following theorem. 



Theorem 1. For every $\varepsilon > 0$ there is a deterministic algorithm for deciding
satisfiability of propositional formulas in $k$-CNF whose running time is

$$\left( \frac{2k}{k+1} + \varepsilon \right)^n,$$

where $n$ is the number of variables in the input formula.
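The minimization that leads to the base $2k/(k+1)$ can be checked numerically. The sketch below is an illustration only; the grid search is a stand-in for the calculus mentioned in the text, and the function names are hypothetical.

```python
def base(p, k):
    """Base of the running-time bound: 2 * p^p * (1-p)^(1-p) * k^p."""
    return 2 * p ** p * (1 - p) ** (1 - p) * k ** p

def best_p_on_grid(k, steps=10_000):
    """Grid search over 0 < p < 1/2 for the p that minimizes the base."""
    grid = [i / (2 * steps) for i in range(1, steps)]
    return min(grid, key=lambda p: base(p, k))

for k in (3, 4, 5):
    p_star = 1 / (k + 1)
    # columns: k, base at p = 1/(k+1), closed form 2k/(k+1), grid minimizer
    print(k, base(p_star, k), 2 * k / (k + 1), best_p_on_grid(k))
```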



5 Improved Local Search

The local search procedure Search($F$, $a$, $r$) from Sect. 2 picks any false clause for
branching. The complexity of this procedure is at most $k^r$. Choosing a clause for
branching more carefully, we can improve this bound and, thereby, the overall
running time of our main algorithm. In this section we show how to improve
Search($F$, $a$, $r$) for formulas in 3-CNF so that its complexity is bounded by $2.848^r$
instead of $3^r$. We then obtain the bound $1.481^n$ for 3-SAT.

Let $F$ be a formula and $a$ an assignment. Let $C$ be a clause in $F$. We classify
$C$ according to the number of its literals true under $a$. Namely, we call $C$ a

$$(\underbrace{1, \ldots, 1}_{p}, \underbrace{0, \ldots, 0}_{m})\text{-clause}$$

if $C$ consists of exactly $p$ literals true under $a$, and exactly $m$ literals false under
$a$. For example, a (1, 1, 0)-clause consists of two true literals and one false literal.

We say that $F$ is $i$-false under $a$ ($i \ge 0$) iff $F$ has no satisfying assignment
within Hamming distance $i$ from $a$. When $i$ is fixed, we can test in polynomial
time whether $F$ is $i$-false under $a$, where $F$ and $a$ are given as input.
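Both notions are easy to state in code. In this sketch, which is an illustration under assumed conventions, clauses are lists of non-zero integers (negative means negated), assignments are dictionaries from variable index to truth value, and the $i$-falseness test simply tries all assignments within distance $i$, which takes $O(n^i)$ evaluations for fixed $i$.

```python
from itertools import combinations

def clause_type(clause, a):
    """Tuple of 1s (true literals) followed by 0s (false literals) under a."""
    values = [int(a[abs(l)] == (l > 0)) for l in clause]
    return tuple(sorted(values, reverse=True))

def satisfies(clauses, a):
    return all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses)

def is_i_false(clauses, a, i):
    """True iff no satisfying assignment lies within Hamming distance i of a."""
    variables = sorted(a)
    for d in range(i + 1):
        for flips in combinations(variables, d):
            b = dict(a)
            for v in flips:
                b[v] = not b[v]
            if satisfies(clauses, b):
                return False
    return True

a = {1: True, 2: True, 3: False}
print(clause_type([1, 2, 3], a))     # two true literals, one false literal
print(clause_type([-1, -2, 3], a))   # all three literals false under a
print(is_i_false([[-1, -2, 3]], a, 0), is_i_false([[-1, -2, 3]], a, 1))
```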

The following observation follows from the fact that a (1)-clause $\bar{l}$ of $F$
becomes the empty clause in $F|_{l=1}$.

Lemma 3. Let $F$ be a formula false under $a$. If $l_1 \vee l_2 \vee l_3$ is a (0, 0, 0)-clause
and $\bar{l}_i$ is a (1)-clause of $F$ then we have the following equivalence: $F$ is satisfied
within Hamming distance $\le r$ from $a$ iff there is a $j \ne i$ such that $F|_{l_j=1}$ is
satisfied within Hamming distance $\le r - 1$ from $a$.




Fig. 1. Branching on I-, II-, and III-constrained formulas.



Let $F$ be a non-trivial formula in 3-CNF. The following definitions describe
types of formulas used in the new version of the procedure Search. Figure 1
illustrates these types.

1. Let $F$ be false under $a$. We say that $F$ is I-constrained with respect to $a$ iff
one of the following holds:

— $F$ has a (0)- or (0, 0)-clause with respect to $a$.

— $F$ has a (0, 0, 0)-clause containing a literal $l$ such that $\bar{l}$ is a (1)-clause
of $F$.

2. Let $F$ be 1-false under $a$. We say that $F$ is II-constrained with respect to $a$
iff one of the following holds:

— $F$ is I-constrained with respect to $a$.

— $F$ has a (0, 0, 0)-clause containing a literal $l$ such that the formula $F|_{l=1}$
is I-constrained.

3. Let $F$ be 2-false under $a$. We say that $F$ is III-constrained with respect to $a$
iff one of the following holds:

— $F$ is I-constrained or II-constrained with respect to $a$.

— $F$ has a (0, 0, 0)-clause such that for each literal $l$ in this clause, $F|_{l=1}$ is
II-constrained.

The next lemma and its corollary show that there is no need to make recursive
calls for formulas that are not III-constrained, because such formulas turn out to be
satisfiable.
Lemma 4. Let $F$ be non-trivial and 3-false under $a$. Let $l$ be a literal false
under $a$. Let $F|_{l=1}$ still be non-trivial (it is 2-false under $a$ by assumption). Then
we have the implication: if $F|_{l=1}$ is III-constrained with respect to $a$ then $F|_{l=1}$ is
II-constrained or $F$ itself is III-constrained.



Proof. First, a preparatory remark: If $F$ is non-trivial, 1-false under $a$ and not
I-constrained then $F|_{l=1}$ is non-trivial for any literal $l$ belonging to a clause of
$F$ false under $a$. This is because the empty clause could only be generated if $F$
contained the clause $\bar{l}$, which is not the case as $F$ is not I-constrained under $a$.

We show that if $F|_{l=1}$ is III-constrained with respect to $a$ then $F|_{l=1}$ is
II-constrained or $F$ is III-constrained.

By assumption $F|_{l=1}$ is non-trivial and 2-false under $a$. If $F|_{l=1}$ is I- or
II-constrained, the lemma holds. The remaining case to consider is that $F|_{l=1}$ has
a (0, 0, 0)-clause $l_1 \vee l_2 \vee l_3$ such that each $F|_{l=1, l_i=1}$ is II-constrained. If an
$F|_{l=1, l_i=1}$ is I-constrained then $F|_{l=1}$ is II-constrained and the lemma holds. We
have to consider the situation that in each $F|_{l=1, l_i=1}$ we have a (0, 0, 0)-clause
$k_i \vee k_{i,1} \vee k_{i,2}$ such that (without loss of generality) setting $k_i = 1$ shows that
$F|_{l=1, l_i=1}$ is II-constrained. This means that $G_i = F|_{l=1, l_i=1, k_i=1}$ is I-constrained
for $i = 1, 2, 3$. As $F$ is 3-false under $a$, we obtain that $G_i$ is false under $a$.

It follows from the remark in the beginning of our proof that $G_i$ is non-trivial.

The final idea is as follows. A clause that causes $G_i$ to be I-constrained is
present in $F$ or generated in 1 or 2 (but not 3, because $F$ is in 3-CNF) of the
steps $l = 1$, $l_i = 1$, $k_i = 1$. If the clause is present in $F$ we are done. If the step
$l = 1$ is among the steps making at least one $G_i$ I-constrained, $F|_{l=1}$ is already
II-constrained. If for no $G_i$ the step $l = 1$ is necessary to make $G_i$ I-constrained,
the clause $l_1 \vee l_2 \vee l_3$ and other clauses already present in $F$ cause each $G_i$ to be
I-constrained. This means that $F$ itself is III-constrained. The missing details of
this argument follow below.

We consider several cases depending on the types of clauses that make $G_i$
I-constrained:

Case of (0, 0)-clause. If $G_i$ has a (0, 0)-clause $h_1 \vee h_2$ then $h_1 \vee h_2$ is
generated when setting $k_i = 1$. This is because we assume that $F|_{l=1, l_i=1}$ is not
I-constrained. Therefore $F|_{l=1, l_i=1}$ has a (1, 0, 0)-clause $\bar{k}_i \vee h_1 \vee h_2$ and the
(0, 0, 0)-clause $k_i \vee k_{i,1} \vee k_{i,2}$ where $h_1, h_2 \ne k_i$. As $F$ is in 3-CNF these clauses
are also present in $F$ and we obtain that $F$ is already II-constrained.

Case of (0)-clause. If $G_i$ has a (0)-clause $h_1$ then $h_1$ is generated when setting
$k_i = 1$. Again this is because we assume that $F|_{l=1, l_i=1}$ is not I-constrained.
Therefore $F|_{l=1, l_i=1}$ has a (1, 0)-clause $\bar{k}_i \vee h_1$ and the (0, 0, 0)-clause $k_i \vee k_{i,1} \vee k_{i,2}$
where $h_1 \ne k_i$. The clause $k_i \vee k_{i,1} \vee k_{i,2}$ is present in $F$ and $F|_{l=1}$. If $\bar{k}_i \vee h_1$ is
also present in $F|_{l=1}$ then $F|_{l=1}$ is II-constrained and we are done.

We have to assume that $\bar{k}_i \vee h_1$ is not present in $F|_{l=1}$. Then $F|_{l=1}$ has the
(0, 0, 0)-clauses

$$l_1 \vee l_2 \vee l_3, \qquad k_i \vee k_{i,1} \vee k_{i,2}$$

and the (1, 1, 0)-clause $\bar{l}_i \vee \bar{k}_i \vee h_1$. Since $F$ is in 3-CNF, these clauses also belong
to $F$.

Case of (0, 0, 0)-clause and (1)-clause. Let $h'_i \vee h''_i \vee h'''_i$ be a (0, 0, 0)-clause
with respect to $a$ in $G_i$. Let $\bar{h}'_i$ be a (1)-clause in $G_i$ (without loss of generality).
If $F|_{l=1, l_i=1}$ has the (1)-clause $\bar{h}'_i$ then $F|_{l=1}$ is II-constrained and we are done.

So we assume that $F|_{l=1, l_i=1}$ does not have the clause $\bar{h}'_i$. Then $F|_{l=1, l_i=1}$
must have the clause $\bar{k}_i \vee \bar{h}'_i$. If this clause is also present in $F|_{l=1}$ then $F|_{l=1}$ is
II-constrained and we are done.

We still need to assume that the clause $\bar{k}_i \vee \bar{h}'_i$ is not in $F|_{l=1}$. Then $F|_{l=1}$
has the (0, 0, 0)-clauses

$$l_1 \vee l_2 \vee l_3, \qquad k_i \vee k_{i,1} \vee k_{i,2}, \qquad h'_i \vee h''_i \vee h'''_i$$

and the (1, 1, 1)-clause $\bar{l}_i \vee \bar{k}_i \vee \bar{h}'_i$. Since $F$ is in 3-CNF, these clauses are also
present in $F$.

If we cannot conclude that $F$ or $F|_{l=1}$ is II-constrained by now, we obtain
that for each $i = 1, 2, 3$, one of the two remaining cases above applies. Then the
(0, 0, 0)-clauses $l_1 \vee l_2 \vee l_3$, $k_i \vee k_{i,1} \vee k_{i,2}$ for $i = 1, 2, 3$, and the other clauses as
specified above are present in $F$. These clauses show that $F$ is III-constrained
because when we branch on the clause $l_1 \vee l_2 \vee l_3$ instead of considering $F|_{l=1}$,
we can see that each $F|_{l_i=1}$ is II-constrained. □

Corollary 2. Let $F$ be non-trivial and 2-false with respect to $a$. If $F$ is not
III-constrained with respect to $a$ then $F$ is satisfiable.

Proof. If $F$ is not 3-false under $a$ then $F$ is satisfiable and the corollary is proved.
So assume $F$ is 3-false and has clauses false under $a$. The only such clauses are
(0, 0, 0)-clauses because $F$ is not III-constrained and, therefore, not I-constrained
with respect to $a$. Let $l_1 \vee l_2 \vee l_3$ be a (0, 0, 0)-clause with respect to $a$ in $F$. Then
each $F|_{l_i=1}$ is non-trivial and 2-false (cf. the remark in the beginning of the
proof of Lemma 4). If each $F|_{l_i=1}$ is III-constrained, we obtain from Lemma 4
that each $F|_{l_i=1}$ is II-constrained or $F$ itself is III-constrained. In any case $F$ is
III-constrained.

However $F$ is not III-constrained by assumption. Therefore we obtain that at
least one $F|_{l_i=1}$ is non-trivial, 2-false under $a$, and not III-constrained. Moreover,
$F|_{l_i=1}$ has strictly fewer false clauses than $F$. Induction on the number of false
clauses of $F$ shows that finally we arrive at a formula that is no longer
3-false under $a$ and hence satisfiable. □

Like the initial version of the procedure Search($F$, $a$, $r$) defined in Sect. 2, the
new version takes as input a formula $F$ in 3-CNF with $n$ variables, an initial
assignment $a$, and a radius $r$. It returns true if $F$ has a satisfying assignment
inside the ball of radius $r$ around $a$. If $F$ is unsatisfiable, the procedure returns
false.

Procedure Search($F$, $a$, $r$)

1. If $F$ is not 2-false under $a$ then return true.

2. If $r \le 0$ then return false.

3. If $F$ contains the empty clause then return false.

4. If $F$ is not III-constrained with respect to $a$ then return true.

5. If $F$ is I-constrained with respect to $a$ then branch on a false clause that
certifies that $F$ is I-constrained.

6. If $F$ is II-constrained with respect to $a$ then branch on a false clause that
certifies that $F$ is II-constrained.

7. Branch on a false clause that certifies that $F$ is III-constrained.

Here the notion of branching is modified as follows. If we branch on a
(0, 0, 0)-clause $l_1 \vee l_2 \vee l_3$ such that $F$ has the (1)-clause $\bar{l}_i$ then we do not run
Search($F|_{l_i=1}$, $a$, $r - 1$).

The correctness of the procedure follows from Lemma 3 and Corollary 2 by induction
on $r$. To estimate the number of leaves of the recursion tree, we use the function
$H$ defined by recursion as follows: $H(0) = 1$, $H(1) = 3$, $H(2) = 9$, and for $r \ge 3$

$$H(r) = 6 \cdot (H(r-2) + H(r-3)).$$



Lemma 5. Let $L(F, a, r)$ be the number of leaves of the recursion tree of the
procedure Search($F$, $a$, $r$). Then we have $L(F, a, r) \le H(r)$ for all $r$.

Proof. We first show that $2 \cdot H(r-1) \le H(r)$ for all $r \ge 1$. This holds for $r = 1$,
$r = 2$ and $r = 3$. For $r > 3$ the inequality is proved by easy induction. Second
we show that $2 \cdot (H(r-1) + H(r-2)) \le H(r)$ for $r \ge 2$. This can be calculated
directly for $r = 2, 3, 4$. For $r > 4$ we use induction:

$$\begin{aligned}
2 \cdot H(r-1) + 2 \cdot H(r-2) &= 12 \cdot H(r-3) + 12 \cdot H(r-4) + 12 \cdot H(r-4) + 12 \cdot H(r-5) \\
&\le 6 \cdot H(r-2) + 6 \cdot H(r-3) = H(r) \quad \text{(induction hypothesis)}
\end{aligned}$$

The claim of the lemma now holds for $r = 0, 1, 2$. For induction on $r$ we proceed
as follows. If $F$ is not 2-false under $a$, we have one leaf in the recursion tree of
Search($F$, $a$, $r$) and the claim holds. The same applies when $F$ has the empty
clause. If $F$ is not III-constrained with respect to $a$, we again have only one leaf
and the claim holds.

Otherwise $F$ is I-, II-, or III-constrained. The claim follows by induction,
applying the inequalities above and the definition of $H$. □

To obtain an explicit bound on the size of the recursion tree, we solve the
recurrence for $H$. If $\alpha$ satisfies the equation $\alpha^3 = 6\alpha + 6$, we assume $H(m) = \alpha^m$
for $m \le r$ and derive $H(r+1) = \alpha^{r+1}$. The initial conditions are taken care
of when we bound $H(r) \le 9 \cdot \alpha^r$ for all $r$, where $\alpha$ is the unique (some calculus
required) positive solution of the cubic equation above. One can calculate that
$\alpha = \sqrt[3]{4} + \sqrt[3]{2}$, which is between 2.847 and 2.848. Hence the induction base holds.
For the induction step observe that

$$H(r+1) \le 6 \cdot 9 \cdot \alpha^{r-1} + 6 \cdot 9 \cdot \alpha^{r-2} = 9 \cdot \alpha^{r+1} \quad \text{(induction hypothesis)}.$$

Choosing $p = 0.26$, we obtain a satisfiability algorithm running in time
exponential in $n$ where the base of the exponent is, up to a factor that tends to 1
as $\varepsilon \to 0$,

$$2.848^{0.26} \cdot 2^{1-h(0.26)}.$$

Therefore, the running time of our 3-SAT algorithm is bounded by $1.481^n$.
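The recurrence and the resulting constants can be checked numerically. The following sketch is illustrative only: it evaluates $H$, tests the two doubling inequalities used in the proof of Lemma 5 on a finite range, and recomputes the base for $p = 0.26$.

```python
from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def H(r):
    """H(0)=1, H(1)=3, H(2)=9 and H(r) = 6 * (H(r-2) + H(r-3)) for r >= 3."""
    if r < 3:
        return (1, 3, 9)[r]
    return 6 * (H(r - 2) + H(r - 3))

def binary_entropy(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

alpha = 4 ** (1 / 3) + 2 ** (1 / 3)   # positive root of x^3 = 6x + 6
base = 2.848 ** 0.26 * 2 ** (1 - binary_entropy(0.26))

print(alpha, alpha ** 3 - (6 * alpha + 6))
print([H(r) for r in range(8)])
print(base)
```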

6 Conclusion 

The fourth author has shown that the idea of local search yields
probabilistic algorithms whose running time is the best known for probabilistic 3-SAT
algorithms. Here we show that local search can also be used to obtain fast
deterministic algorithms for $k$-SAT. Similarly to HS|, it is possible to extend our
approach to the more general class of constraint satisfaction problems. In
contrast to other deterministic $k$-SAT algorithms, our basic algorithm presented in
Sect. 4 is technically very simple and has a better running time.

The improvement for 3-SAT given in Sect. 5 can be generalized to $k$-SAT. It
is an open problem whether this method can be used to improve the complexity
of the probabilistic algorithm from [Ej (either for 3-SAT or for $k$-SAT).
Abstract. In this paper we introduce a new technique to solve lattice
problems. The technique is based on dual HKZ-bases. Using this technique
we show how to solve the closest vector problem in lattices with
rank $n$ in time $n! \cdot s^{O(1)}$, where $s$ is the input size of the problem. This is
an exponential improvement over an algorithm due to Kannan and Hel- 
frich |1 tilir)[ . Based on the new technique we also show how to compute 
the successive minima of a lattice in time $n! \cdot 3^n \cdot s^{O(1)}$, where $n$ is the
rank of the lattice and $s$ is the input size of the lattice. The problem
of computing the successive minima plays an important role in Ajtai’s 
worst-case to average-case reduction for lattice problems. Our results 
reveal a close connection between the closest vector problem and the 
problem of computing the successive minima of a lattice. 



1 Introduction 

In this paper we study two classical problems from the geometry of numbers,
the closest vector problem (Cvp) and the shortest linearly independent vectors
problem (Sivp). In the closest vector problem we are given a lattice $L$ and some
vector $t$ in the $\mathbb{R}$-vector space span($L$) spanned by the vectors in $L$. We are asked
to find a vector $u \in L$ whose Euclidean distance to $t$ is as small as possible. To
define the second problem, first we need to define the successive minima $\lambda_k(L)$
of a lattice $L$. We call the dimension of span($L$) the rank of $L$. Let $k$ be an
integer less than or equal to the rank of $L$. The $k$-th successive minimum $\lambda_k(L)$ of $L$ is
the smallest real number $r$ such that $L$ contains $k$ linearly independent vectors of
Euclidean length at most $r$. In the shortest linearly independent vectors problem
we are given a lattice $L$ with rank $n$. We are asked to find $n$ linearly independent
vectors $v_1, \ldots, v_n$ such that the length of $v_k$ is $\lambda_k(L)$. It is a well-known fact
that vectors with this property always exist (see |n|). 
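For intuition, the successive minima of a small rank-2 lattice can be found by brute force. This sketch only illustrates the definition; it is not one of the algorithms of this paper. It enumerates integer combinations with coefficients in a fixed box, which is an assumption that holds for the well-conditioned examples used here but not for arbitrary bases.

```python
from itertools import product
from math import hypot

def successive_minima_2d(b1, b2, box=5):
    """Brute-force (lambda_1, lambda_2) of the lattice spanned by b1 and b2."""
    vectors = []
    for x, y in product(range(-box, box + 1), repeat=2):
        if (x, y) != (0, 0):
            vectors.append((x * b1[0] + y * b2[0], x * b1[1] + y * b2[1]))
    vectors.sort(key=lambda v: hypot(*v))
    v1 = vectors[0]  # a shortest non-zero vector
    # first vector, in order of length, that is linearly independent of v1
    v2 = next(v for v in vectors if v1[0] * v[1] - v1[1] * v[0] != 0)
    return hypot(*v1), hypot(*v2)

print(successive_minima_2d((1, 0), (0, 2)))
print(successive_minima_2d((1, 0), (1, 2)))  # same lattice minima, skewed basis
```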

Next to the shortest vector problem, Cvp is the most intensively studied 
problem in the algorithmic geometry of numbers. Cvp is an NP-hard problem 
(see for example fSl)- Cvp is also hard to approximate (see ptMLIp . The best 
result in this direction is due to m, who show that CvP can not be approxi- 
mated within a factor of where n is the rank of the lattice. On the 

* Work done while at Institut für theoretische Informatik, ETH Zurich, Switzerland

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 248-E3 2000. 

(c) Springer-Verlag Berlin Heidelberg 2000
positive side, Babai 0 showed that in polynomial time the distance of a vector
$t \in$ span($L$) to its closest vectors in $L$ can be approximated within a factor of
$(3/\sqrt{2})^n$. Babai’s algorithm is based on the LLL-algorithm. The best algorithm
to determine the exact distance from $t \in$ span($L$) to $L$ is due to Kannan m,
with improvements partly due to Helfrich m. Kannan’s algorithm achieves a
running time of $n^n \cdot s^{O(1)}$, where $n$ is the rank of $L$ and $s$ is the representation size
of $t$ and $L$. Kannan’s algorithm is based on so-called HKZ-bases for lattices. An
HKZ-basis for $L$ is a basis for $L$ consisting of “short” vectors (a precise definition
is given in Section 2).

The first result of this paper is an improvement of Kannan’s algorithm. We
describe an algorithm that computes a vector in $L$ closest to $t$ in time $n! \cdot s^{O(1)}$.
Since $n^n / n! \approx e^n$, this is an exponential improvement over Kannan’s result. The
new algorithm is simple and, unlike previous algorithms for lattice problems, it
uses dual HKZ-bases. A dual HKZ-basis is a basis whose dual is an HKZ-basis of
the dual lattice $L^*$ (see Section 2 for exact definitions). The algorithm is based
on the following two facts.

1. Assume $t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$, where $[b_1, \ldots, b_n]$ is a dual HKZ-basis for $L$.
If $u = \sum_{j=1}^{n} c_j b_j$, $c_j \in \mathbb{Z}$, is a vector in $L$ closest to $t$, then

$$|q_n - c_n| \le n/2. \tag{1}$$

2. Assume $c \in \mathbb{Z}$ and $v$ is a vector in the lattice $L'$ spanned by the vectors
$[b_1, \ldots, b_{n-1}]$ such that $c \cdot b_n + v$ is a vector in $L$ closest to $t$. Then $v$ is a
vector in $L'$ closest to the orthogonal projection of $t - c \cdot b_n$ onto span($L'$).

The second fact is straightforward. The first one follows from so-called
transference bounds. Transference bounds relate fundamental constants of a lattice $L$
and its dual $L^*$ (see cm). From the above mentioned facts one gets the following
algorithm for Cvp. For all integers $c$ with $|q_n - c| \le n/2$, recursively compute
a vector $v_c$ in $L'$ closest to the orthogonal projection of $t - c \cdot b_n$ onto span($L'$). Among
the vectors $v_c + c \cdot b_n$ determine one closest to $t$.

The new technique based on dual HKZ-bases also sheds light on the
complexity of Sivp. Although the successive minima of a lattice are a classical notion
in the geometry of numbers, until recently the complexity of Sivp has received
little attention. This changed with Ajtai’s discovery of the connection between 
the average-case complexity of the shortest vector problem (Svp) and the worst- 
case complexity of SivP (see 1112112110111! among others). This connection also 
opened up the possibility of developing provably secure cryptographic primitives 
based on the single assumption P $\ne$ NP (see for example jSj). Hence,
understanding the complexity of Sivp is an important issue in complexity theory as well as
cryptography.

In 0 it was shown that Sivp is NP-hard. Moreover, it was shown, that 
A„(L) is hard to approximate within a factor of Here n is the 
rank of the lattice $L$. On the positive side, the LLL-algorithm approximates
all successive minima of a lattice within a factor of $2^{n/2}$. The only previously
published algorithm we are aware of that solves Sivp exactly is due to Pohst
m- However, this algorithm predates the LLL-algorithm and no analysis for
the algorithm was given. An algorithm of Helfrich m to compute a so-called
Minkowski-basis of a lattice can be modified in order to solve Sivp. This modified
algorithm solves Sivp in time $n^n \cdot s^{O(1)}$, where $n$ is the rank of $L$ and $s$ its
representation size.

In this paper, we observe that Sivp closely resembles Cvp. In fact, the
problem Sivp is a special case of the following generalization of Cvp. As in Cvp,
we are given a lattice $L$ and a vector $t \in$ span($L$). Moreover, we are given an
affine space $A \subseteq$ span($L$). We are asked to find a vector $u$ in $L \setminus A$ closest to $t$.
That is, we are looking for a closest vector to $t$ in $L$ avoiding the affine space $A$.
We call this problem the generalized closest vector problem (Gcvp). The
connection between Gcvp and Sivp is as follows. Suppose we already know linearly
independent lattice vectors $v_1, \ldots, v_{k-1}$, where the length of $v_i$ is $\lambda_i(L)$. It is
well-known that there is always a lattice vector $v_k$ linearly independent from
$v_i$, $i \le k-1$, with length $\lambda_k(L)$. We can characterize $v_k$ as follows. The vector $v_k$ is
independent from $v_i$, $i \le k-1$, iff $v_k \notin S = \mathrm{span}(v_1, \ldots, v_{k-1})$. Moreover, it
must be a vector in $L \setminus S$ closest to the origin $0$. Hence, $v_k$ can be described
as a solution to an instance of Gcvp. It might seem more natural to describe
$v_k$ as a shortest vector in $L \setminus S$. However, we will give a recursive algorithm for
Sivp and Gcvp that closely resembles the algorithm for Cvp described above.
In this recursive algorithm, even if we start with a linear subspace $S$ and target
vector $0$, we need to solve instances of Gcvp with arbitrary target vectors and
arbitrary affine subspaces.

Our algorithm for Sivp and Gcvp achieves a running time of $3^n \cdot n! \cdot s^{O(1)}$.
This differs from the running time for Cvp by a factor of $3^n$. This difference
is caused by a weaker version of equation (1) for Gcvp. The running time we
achieve is exponentially better than the running time one achieves by using the
technique of Helfrich based on HKZ-bases. Moreover, the algorithm indicates
that Sivp is at least as difficult as Cvp.

Let us briefly summarize the main contributions of this paper. 

1. We introduce a new technique for solving lattice problems. This technique 
is based on dual HKZ-bases. 

2. Based on the technique, we give an improved algorithm for the closest vector 
problem. 

3. We give a new algorithm for the shortest independent vectors problem. 

4. We show a close connection between Sivp and Gcvp. This in turn hints at
a connection between Sivp and Cvp.

The last point immediately leads to the main open question this paper raises. Is
the problem Sivp polynomial-time reducible to Cvp, or vice versa? More generally,
is Gcvp polynomial-time reducible to Cvp? It would also be interesting to
see whether other lattice problems can be solved using dual HKZ-bases. A good
candidate is the computation of a Minkowski-basis of a lattice (see [ 1 1. 
The paper is organized as follows. In Section 2 we state the most important
facts used in this paper. In Section 3 we describe the algorithm for the closest
vector problem. The solution to the shortest independent vectors problem and
the generalized closest vector problem is given in Section 4.



2 Basic Definitions and Facts



In this section we define several fundamental concepts and state important
results from the geometry of numbers that will be used throughout this paper.
Given a lattice $L$ and a target vector $t \in$ span($L$), by $\mu(t, L)$ denote the
Euclidean distance of a closest vector in $L$ to $t$.

Definition 1. The number $\mu(L) := \max_{t \in \mathrm{span}(L)} \mu(t, L)$ is called the covering radius
of $L$.

Given a lattice $L$ and a basis $[b_1, \ldots, b_n]$ of $L$, we denote the sublattice with
basis $[b_1, \ldots, b_k]$ by $L_k$, $k = 1, \ldots, n$. Hence $L_n = L$. The dual lattice $L^*$ of $L$ is
defined as

$$L^* := \{ v \in \mathrm{span}(L) : \langle v, w \rangle \in \mathbb{Z} \text{ for all } w \in L \}.$$

For a basis $[b_1, \ldots, b_n]$ of $L$ there is a unique basis $[b_1^*, \ldots, b_n^*]$ of $L^*$ satisfying

$$\langle b_i, b_j^* \rangle = \begin{cases} 1 & \text{if } i + j = n + 1, \\ 0 & \text{otherwise.} \end{cases}$$

This basis is called a dual basis to the basis $[b_1, \ldots, b_n]$.
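Note the reversed indexing: $b_1^*$ pairs with $b_n$. The following sketch computes a dual basis with this convention in exact arithmetic. It assumes a full-rank lattice whose basis vectors are the rows of a square rational matrix; the helper names are illustrative.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square rational matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dual_basis(basis):
    """Rows b*_1, ..., b*_n with <b_i, b*_j> = 1 iff i + j = n + 1, else 0."""
    n = len(basis)
    inv = inverse(basis)
    # Column j of the inverse is the vector d_j with <b_i, d_j> = [i == j].
    d = [[inv[r][j] for r in range(n)] for j in range(n)]
    return d[::-1]  # reverse the order to obtain the indexing used in the text

basis = [[2, 0], [1, 1]]
print(dual_basis(basis))
```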

The following theorem is the most important ingredient in the analysis of our 
algorithms. Results of this type are usually referred to as transference bounds, 
since they relate fundamental constants of L to fundamental constants of L*. 
The version we state here is due to Banaszczyk 0 . Weaker versions were proved 
by Lagarias, Lenstra, and Schnorr HE|. 

Theorem 1 (Transference bounds). For every lattice $L$ of rank $n$ and its
dual $L^*$ the following inequalities hold:

1. $\lambda_i(L) \cdot \lambda_{n-i+1}(L^*) \le n$, $i = 1, \ldots, n$.

2. $\mu(L) \cdot \lambda_1(L^*) \le n/2$.

For $n \to \infty$, the right-hand sides in the theorem both can be replaced by $n/(2\pi)$.
Using these bounds the running times of our algorithms can be improved.
However, in order to keep the exposition simple, we will use the bounds given in
Theorem 1, valid for any lattice.

For a lattice $L$ in $\mathbb{R}^m$ with basis $[b_1, \ldots, b_n]$ and for $1 \le i \le n$ we consider the
projected lattices $\pi_i(L)$, where $\pi_i : \mathbb{R}^m \to \mathrm{span}(L_{i-1})^{\perp}$ denotes the orthogonal
projection onto the orthogonal complement of span($L_{i-1}$). The lattice $\pi_i(L)$
has rank $n-i+1$ and a basis is given by $[\pi_i(b_i), \ldots, \pi_i(b_n)]$. For a vector $v \in L$ we
denote $\pi_i(v)$ by $v(i)$. Moreover, $\pi_i(b_i) = b_i^{\dagger} = b_i(i)$ for basis vectors $b_1, \ldots, b_n$.
The vectors $[b_1^{\dagger}, \ldots, b_n^{\dagger}]$ are the Gram-Schmidt orthogonalization of $[b_1, \ldots, b_n]$.
The numbers

$$\mu_{i,j} := \frac{\langle b_i, b_j^{\dagger} \rangle}{\langle b_j^{\dagger}, b_j^{\dagger} \rangle}, \qquad 1 \le j < i \le n,$$

are called the Gram-Schmidt coefficients of $[b_1, \ldots, b_n]$. A basis $[b_1, \ldots, b_n]$ is
called weakly reduced iff $|\mu_{i,j}| \le 1/2$ for $1 \le j < i \le n$.
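A short sketch of the Gram-Schmidt orthogonalization and the weak-reduction test follows. It uses exact rational arithmetic; the function names and the example bases are assumptions made for illustration.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(basis):
    """Return the orthogonalized vectors and the coefficients mu[i, j], j < i."""
    ortho, mu = [], {}
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j, o in enumerate(ortho):
            mu[i, j] = dot(b, o) / dot(o, o)
            v = [vi - mu[i, j] * oi for vi, oi in zip(v, o)]
        ortho.append(v)
    return ortho, mu

def is_weakly_reduced(basis):
    """True iff every Gram-Schmidt coefficient has absolute value at most 1/2."""
    _, mu = gram_schmidt(basis)
    return all(abs(m) <= Fraction(1, 2) for m in mu.values())

print(gram_schmidt([[2, 0], [1, 1]]))
print(is_weakly_reduced([[2, 0], [1, 1]]), is_weakly_reduced([[1, 0], [3, 1]]))
```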



Definition 2. A basis $[b_1, \ldots, b_n]$ of a lattice $L$ is called an HKZ-basis
(Hermite, Korkin, Zolotarev) iff

(i) $[b_1, \ldots, b_n]$ is weakly reduced.

(ii) $b_i^{\dagger}$ is a shortest non-zero vector in $\pi_i(L)$, $i = 1, \ldots, n$.

A basis $[b_1, \ldots, b_n]$ of a lattice $L$ is called a dual HKZ-basis iff the dual basis
$[b_1^*, \ldots, b_n^*]$ is an HKZ-basis of the dual lattice $L^*$.

By (ii), the first vector $b_1$ of an HKZ-basis $[b_1, \ldots, b_n]$ is a shortest vector of
the lattice $L$. The following lemma will prove to be useful. It was used without
proof in HB|. A proof for it will be contained in the full version of this paper. 

Lemma 1. If $[b_1, \ldots, b_n]$ is a dual HKZ-basis for $L$, then $[b_1, \ldots, b_k]$ is a dual
HKZ-basis for $L_k$, $k = 1, \ldots, n$.



Next we need to say something about representations and representation
sizes of vectors, lattices, and affine subspaces. We always assume that a lattice
$L \subseteq \mathbb{Q}^m$ is given by a basis $[b_1, \ldots, b_n]$ of $L$. The representation size $s$ of a
lattice $L \subseteq \mathbb{Q}^m$ with respect to the basis $[b_1, \ldots, b_n]$ is the maximum of $m$
and the binary lengths of the numerators and denominators of the coordinates
of the basis vectors $b_j$. Given a vector $t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$, the representation
size of $t$ with respect to $[b_1, \ldots, b_n]$ is the maximum of $m$ and of the binary
lengths of the numerators and denominators of the coefficients $q_j$. In the sequel,
if we speak of the representation size of a lattice $L$, or of a vector $t \in$ span($L$),
without referring to some specific basis, we implicitly assume that some basis
$[b_1, \ldots, b_n]$ for $L$ is given. Finally, an affine space $A$ is represented as $u + S$ for
some vector $u \in \mathbb{Q}^m$ and some linear subspace $S$. The subspace $S$ is represented
by some basis $[w_1, \ldots, w_k]$, $k \le m$. The representation size of $A$ with respect
to $u$ and $S$ is the maximum of $m$ and the binary lengths of the numerators and
denominators of the coordinates of the vectors $u, w_j$.

The only algorithmic tool we need in this paper is an algorithm to compute 
an HKZ-basis for a lattice L. Recall that the first vector of an HKZ-basis of L is 
a shortest vector in L. Incorporating improvements by Helfrich, an algorithm by 
Kannan obtains the following running times (see CM). 

Theorem 2. Given a lattice $L \subseteq \mathbb{Q}^m$ with rank($L$) $= n$ and representation size
s, then an HKZ-basis o/L can be computed with arithmetic operations 

on integers of size . In particular, a shortest vector in L can be computed 
within the time bounds stated above. 
3 The Closest Vector Problem



In this section we present the algorithm solving the closest vector problem. In
the following section we will present the algorithm for the generalized closest
vector problem. The algorithms are almost identical, but the basic ideas are
clearer in the case of the (classical) closest vector problem. Also, the running time
we achieve for the closest vector problem is better than for the generalized closest
vector problem. First we state and prove the lemmas crucial to our approach as
outlined in Section 1.



Lemma 2. Let $L$ be a lattice of rank $n$ with dual HKZ-basis $[b_1, \ldots, b_n]$ and let
$t$ be a vector in span($L$), $t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$. If $u = \sum_{j=1}^{n} c_j b_j$, $c_j \in \mathbb{Z}$, is a vector
in $L$ closest to $t$, then

$$|q_n - c_n| \le n/2.$$

Proof. Let $[b_1^*, \ldots, b_n^*]$ be the basis dual to $[b_1, \ldots, b_n]$. By definition of a
dual basis, $|q_n - c_n| = |\langle t - u, b_1^* \rangle|$. Recall that the first vector in an
HKZ-basis of a lattice is a shortest vector of the lattice. Hence, in the present case,
$\|b_1^*\| = \lambda_1(L^*)$. Applying the Cauchy-Schwarz inequality, we obtain

$$|q_n - c_n| = |\langle t - u, b_1^* \rangle| \le \|t - u\| \cdot \|b_1^*\| \le \mu(L) \cdot \lambda_1(L^*).$$

The transference bound (Theorem 1) proves the lemma. □



The next lemma is the key to the recursive step of our algorithm for the closest 
vector problem. 

Lemma 3. Let $L$ be a lattice of rank $n$ with dual HKZ-basis $[b_1, \ldots, b_n]$ and
let $t$ be a vector in span($L$). Fix some integer $c$ and denote by $w$ the orthogonal
projection of $t - c \cdot b_n$ onto span($L_{n-1}$). The vector $v + c \cdot b_n$, $v \in L_{n-1}$,
is a closest vector to $t$ of the form $u + c \cdot b_n$, $u \in L_{n-1}$, iff $v$ is a vector in $L_{n-1}$
closest to $w$.

Proof. By definition $w = t - c \cdot b_n - t(n) + c \cdot b_n^{\dagger}$. Consider an arbitrary vector
of the form $u + c \cdot b_n$, $u \in L_{n-1}$. Since $u(n) = 0$, the squared distance from
$u + c \cdot b_n$ to $t$ is given by $\|t - c \cdot b_n - u\|^2 = \|t(n) - c \cdot b_n^{\dagger}\|^2 + \|w - u\|^2$. The
term $\|t(n) - c \cdot b_n^{\dagger}\|^2$ is independent of the choice of $u$. Hence, for $v \in L_{n-1}$ we
get

$$\|v + c \cdot b_n - t\| = \min_{u \in L_{n-1}} \|u + c \cdot b_n - t\| \iff \|v - w\| = \min_{u \in L_{n-1}} \|u - w\|.$$

This proves the lemma. □

Now we are in a position to state the algorithm for the closest vector
problem, prove its correctness, and analyze its running time. The input to the
algorithm is a dual HKZ-basis $[b_1, \ldots, b_n]$ of the lattice $L$ and a target vector
$t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$.

Algorithm Closest Vector

IF n = 1 THEN
    Compute the closest integer c to q_1 and return c · b_1.
ELSE
    Set C := ∅.
    FOR all integers c with |q_n − c| ≤ n/2 DO
        Compute the orthogonal projection w of t − c · b_n onto span(L_{n−1}).
        Recursively call Algorithm Closest Vector with input basis
        [b_1, ..., b_{n−1}] and target vector w.
        Set C := C ∪ {v + c · b_n}, where v is the vector returned by the
        recursive call.
    END
    Compute and return an element in C closest to t.
END
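The recursion can be sketched in floating-point Python. This is an illustration under stated assumptions, not the paper's exact-arithmetic algorithm: the caller is assumed to supply a basis for which the search range is wide enough (a dual HKZ-basis with the default range $n/2$ of Lemma 2, or an arbitrary basis together with a sufficiently large hypothetical `half_width` parameter), and the target is assumed to lie in the span of the basis.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def last_gram_schmidt_vector(basis):
    """Component of the last basis vector orthogonal to the span of the others."""
    ortho = []
    for b in basis:
        v = list(b)
        for o in ortho:
            mu = dot(b, o) / dot(o, o)
            v = [vi - mu * oi for vi, oi in zip(v, o)]
        ortho.append(v)
    return ortho[-1]

def closest_vector(basis, t, half_width=None):
    """Lattice vector closest to t; tries every integer c with |q_n - c| <= half_width."""
    n = len(basis)
    if n == 1:
        b = basis[0]
        c = round(dot(t, b) / dot(b, b))   # closest integer to q_1
        return [c * x for x in b]
    hw = n / 2 if half_width is None else half_width
    b_n = basis[-1]
    b_dag = last_gram_schmidt_vector(basis)
    q_n = dot(t, b_dag) / dot(b_dag, b_dag)   # coefficient of b_n in t
    best, best_dist = None, math.inf
    for c in range(math.ceil(q_n - hw), math.floor(q_n + hw) + 1):
        shifted = [ti - c * bi for ti, bi in zip(t, b_n)]
        coef = dot(shifted, b_dag) / dot(b_dag, b_dag)
        w = [si - coef * di for si, di in zip(shifted, b_dag)]  # projection onto span(L_{n-1})
        v = closest_vector(basis[:-1], w, half_width)
        candidate = [vi + c * bi for vi, bi in zip(v, b_n)]
        dist = sum((ci - ti) ** 2 for ci, ti in zip(candidate, t))
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best

print(closest_vector([[1, 0], [0, 1]], [0.4, 2.7]))
print(closest_vector([[2, 0], [1, 1]], [0.9, 0.9], half_width=3))
```

At most $2 \cdot \text{half\_width} + 1$ recursive calls are made per level, which is where the factorial growth in the running-time analysis comes from.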



Next we show the correctness of the algorithm. 

Lemma 4. Assume the input to Algorithm Closest Vector is a dual HKZ-basis
$[b_1, \ldots, b_n]$ of the lattice $L$ and a target vector $t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$. Then the
algorithm computes a vector in $L$ closest to $t$.

Proof. We show the correctness of the algorithm by induction on the rank $n$. For
$n = 1$ the algorithm correctly computes closest vectors. So assume the algorithm
correctly computes closest vectors for all lattices of rank $< n$. Now assume $L$ is
a lattice with rank $n$. By Lemma 1, $[b_1, \ldots, b_{n-1}]$ is a dual HKZ-basis of $L_{n-1}$
and, by induction assumption, the recursive calls to the algorithm return closest
vectors to the orthogonal projections of $t - c \cdot b_n$, $c \in \mathbb{Z}$, $|q_n - c| \le n/2$. By Lemma 2
and Lemma 3 a vector in $L$ closest to $t$ can be written as $u + c \cdot b_n$ with $|q_n - c| \le n/2$
and $u$ an arbitrary vector in $L_{n-1}$ closest to the orthogonal projection of $t - c \cdot b_n$
onto span($L_{n-1}$). Hence the last line in the algorithm correctly computes a vector
in $L$ closest to $t$. □

The running time of Algorithm Closest Vector is analyzed as follows. 

Lemma 5. Assume the lattice $L \subseteq \mathbb{R}^m$ and the target vector $t$ have
representation size $s$ with respect to the dual HKZ-basis $[b_1, \ldots, b_n]$. Then Algorithm
Closest Vector uses at most $n! \cdot m^{O(1)}$ arithmetic operations on integers of size
$s^{O(1)}$.

Proof. By $T(n, m)$ denote the maximal number of arithmetic operations Algorithm
Closest Vector uses for lattices $L \subseteq \mathbb{R}^m$ with rank $n$ and target vectors
$t \in$ span($L$). Since projections can be computed with $m^{O(1)}$ arithmetic operations
and the last step in Algorithm Closest Vector also requires $m^{O(1)}$ arithmetic
operations, for $T(n, m)$ we obtain the following recurrence relation

$$T(n, m) = n \cdot T(n - 1, m) + m^{O(1)}.$$

For lattices $L \subseteq \mathbb{R}^m$ with rank 1 the algorithm uses $m^{O(1)}$ arithmetic operations.
Hence, the recurrence for $T(n, m)$ solves to $n! \cdot m^{O(1)}$, proving our claim on the
number of arithmetic operations.

The analysis of the size of the integers involved is straightforward and omitted
in this extended abstract. □
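The step from the recurrence to the factorial bound can be unrolled explicitly. The derivation below is a sketch under the assumption that the additive term of the recurrence is a fixed polynomial in $m$, as the recurrence quoted later in the proof of Lemma 11 indicates; the abbreviation $p$ for that term is introduced here and is not the paper's notation.

```latex
% Assume T(n) = n T(n-1) + p for n >= 2, with T(1) <= p, where p = m^{O(1)}.
\begin{align*}
\frac{T(n)}{n!} &= \frac{T(n-1)}{(n-1)!} + \frac{p}{n!}
  && \text{(divide the recurrence by } n!\text{)} \\
\frac{T(n)}{n!} &= T(1) + p \sum_{k=2}^{n} \frac{1}{k!}
  && \text{(telescope down to } n = 1\text{)} \\
T(n) &\le n! \,\bigl(T(1) + (e - 2)\,p\bigr) \le 2\, n! \, p
  && \text{(since } \textstyle\sum_{k \ge 2} 1/k! = e - 2 < 1\text{)}
\end{align*}
```

The same unrolling applied to a recurrence with factor $3n$ in place of $n$ gives an additional factor of $3^{n}$.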

For Algorithm Closest Vector we assume that we are already given a dual HKZ- 
basis of L and that the target vector t is represented as a linear combination 
of the vectors in the dual HKZ-basis. However, combining Algorithm Closest 
Vector with the Kannan/Helfrich algorithm to compute HKZ-bases we obtain 
the following theorem. 

Theorem 3. Given a lattice $L$ with rank $n$ and a vector $t \in$ span($L$), both with
representation size $s$, a closest vector to $t$ in the lattice $L$ can be computed with
$n! \cdot s^{O(1)}$ arithmetic operations on integers of size $s^{O(1)}$.

Proof. Since $L$ has representation size $s$, the dual lattice $L^*$ has representation
size $s^{O(1)}$. By the Kannan/Helfrich theorem, an HKZ-basis $[b_1^*, \ldots, b_n^*]$ for $L^*$ can be
computed with $n^{n/2} \cdot s^{O(1)}$ arithmetic operations on integers of size $s^{O(1)}$. From this
basis, within the bounds stated in the theorem, we compute a dual HKZ-basis
$[b_1, \ldots, b_n]$ of $L$. Also the representation of $t$ with respect to the basis
$[b_1, \ldots, b_n]$ can be computed within the time bounds stated in the theorem. In
particular, with respect to the new basis $[b_1, \ldots, b_n]$ the lattice $L$ and the vector $t$
have representation size $s^{O(1)}$. Next we apply Algorithm Closest Vector to the dual
HKZ-basis computed previously and target vector $t$ to obtain a vector in $L$ closest
to $t$. The theorem follows from Lemma 5. □

Note that once an HKZ-basis for L* has been computed, the number of arith- 
metic operations used by the algorithm is independent of s. This is a feature our 
algorithms shares with Kannan’s algorithm for the closest vector problem. Kan- 
nan’s algorithm needs arithmetic operations on integers 

of size $s^{O(1)}$. Hence our algorithm uses a number of arithmetic operations that
is by a factor of roughly $n^n / n! \approx e^n$ smaller than the number of operations used
by Kannan's algorithm.

4 Successive Minima and the Generalized Closest Vector 
Problem 

In this section we describe the algorithms for the generalized closest vector prob- 
lem and the shortest independent vectors problem. Since the basic ideas and 
proofs are almost identical to the ones presented in the previous section we will 
be brief. First, we need to generalize Lemma El and Lemma El appropriately. 

Lemma 6. Let $L \subseteq \mathbb{R}^m$ be a lattice of rank $n$ with dual HKZ-basis $[b_1, \ldots, b_n]$,
let $A \subseteq \mathbb{R}^m$ be an affine subspace with $L \not\subseteq A$, and let $t$ be a vector in span($L$),
$t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$. If $u = \sum_{j=1}^{n} c_j b_j$, $c_j \in \mathbb{Z}$, is a vector in
$L \setminus A$ closest to $t$, then

$$|q_n - c_n| \le 3/2 \cdot n.$$

To prove this lemma, first we prove two auxiliary lemmas.

Lemma 7. Let $L \subseteq \mathbb{R}^m$ be a lattice. If $A$ is an affine subspace of $\mathbb{R}^m$ with $L \not\subseteq A$,
then the affine space $A' = A \cap$ span($L$) has dimension at most $n - 1$.

Proof. Since $L \not\subseteq A$, we conclude that span($L$) $\not\subseteq A$. Hence the affine space
$A' = A \cap$ span($L$) is a proper subset of span($L$). This shows that the dimension
of $A'$ is at most $n - 1$. □

Lemma 8. Let $L$ be a lattice with rank $n$ and let $A'$ be an affine subspace of
span($L$) with $\dim(A') < n$. For all $t \in$ span($L$), a closest vector $u$ to $t$ in the set
$L \setminus A'$ has distance at most $\mu(L) + \lambda_n(L)$ to $t$.

Proof. Let $v$ be a closest vector to $t$ in the lattice $L$. Let $v_1, \ldots, v_n \in L$ be
linearly independent with $\|v_i\| = \lambda_i(L)$. Since $v_i$, $i = 1, \ldots, n$, are linearly
independent, at least one vector in $\{v, v + v_1, \ldots, v + v_n\}$ is not contained in $A'$. If
$v \notin A'$ the lemma is clearly true. So assume $v + v_h$, $1 \le h \le n$, is not contained
in $A'$. Since $\|v - t\| \le \mu(L)$ by definition of the covering radius $\mu(L)$,

$$\|v + v_h - t\| \le \|v - t\| + \|v_h\| \le \mu(L) + \lambda_h(L) \le \mu(L) + \lambda_n(L).$$

□



Proof (of Lemma 6). Applying Lemma 7 shows that the dimension of $A' =$
span($L$) $\cap A$ is at most $n - 1$. Since $L \setminus A = L \setminus A'$, Lemma 8 applied with lattice
$L$, affine space $A'$, and target vector $t$ shows

$$\|u - t\| \le \mu(L) + \lambda_n(L). \qquad (2)$$

Denote by $[b_1^*, \ldots, b_n^*]$ a dual basis to $[b_1, \ldots, b_n]$. Using the Cauchy-Schwarz
inequality we obtain

$$|q_n - c_n| = |\langle t - u, b_n^* \rangle| \le \|t - u\| \cdot \|b_n^*\| \le (\mu(L) + \lambda_n(L)) \cdot \lambda_1(L^*),$$

where the last inequality follows from (2) and from $\|b_n^*\| = \lambda_1(L^*)$. Applying the
transference bounds shows

$$|q_n - c_n| \le 3/2 \cdot n.$$

□

The next lemma generalizes Lemma 3. It is the key to the recursive step of our
algorithm for the generalized closest vector problem.

Lemma 9. Let $L$ be a lattice of rank $n$ with dual HKZ-basis $[b_1, \ldots, b_n]$, let
$A \subseteq \mathbb{R}^m$ be an affine subspace with $L \not\subseteq A$, and let $t$ be a vector in span($L$). Fix
an integer $c$ and denote by $w$ the orthogonal projection of $t - c \cdot b_n$ onto
span($L_{n-1}$). The vector $v + c \cdot b_n$, $v \in L_{n-1}$, is a closest vector to $t$ in $L \setminus A$ of
the form $u + c \cdot b_n$, $u \in L_{n-1}$, iff $v$ is a vector in $L_{n-1} \setminus (-c \cdot b_n + A)$ closest to
$w$.
Proof. A vector $u + c \cdot b_n$, $u \in L_{n-1}$, is contained in $L \setminus A$ iff $u$ is contained in
$L_{n-1} \setminus (-c \cdot b_n + A)$. As in the proof of Lemma 3 we can now argue that for
$v \in L_{n-1} \setminus (-c \cdot b_n + A)$ we get

$$\|v + c \cdot b_n - t\| = \min_{u \in L_{n-1} \setminus (-c \cdot b_n + A)} \|u + c \cdot b_n - t\| \iff \|v - w\| = \min_{u \in L_{n-1} \setminus (-c \cdot b_n + A)} \|u - w\|.$$

This proves the lemma. □

Now we are in a position to state the algorithm solving the generalized closest
vector problem. The input to the algorithm is a dual HKZ-basis $[b_1, \ldots, b_n]$
of $L$, an affine subspace $A \subseteq \mathbb{R}^m$, and a target vector
$t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$. To
treat the case that $L \subseteq A$, we introduce a special symbol $\omega$. The algorithm will
output $\omega$ iff $L \subseteq A$.

Algorithm Generalized Closest Vector

IF $n = 1$ THEN
    IF $L \subseteq A$ THEN
        Return $\omega$
    ELSE
        Compute the closest integer $c$ to $q_1$
        IF $c \cdot b_1 \notin A$ THEN
            Return $c \cdot b_1$
        ELSE
            Compute the closest integer $c' \neq c$ to $q_1$ and return $c' \cdot b_1$.
        END
    END
ELSE
    Set $C := \emptyset$.
    FOR $c = -3/2 \cdot n$ TO $3/2 \cdot n$ DO
        Compute the orthogonal projection $w$ of $t - c \cdot b_n$ onto span($L_{n-1}$).
        Recursively call Algorithm Generalized Closest Vector with basis
        $[b_1, \ldots, b_{n-1}]$, affine space $-c \cdot b_n + A$, and target vector $w$.
        IF the recursive call returns a vector $v \neq \omega$ THEN
            Set $C := C \cup \{v + c \cdot b_n\}$.
        END
    END
    IF $C \neq \emptyset$ THEN
        Compute and return a closest element in $C$ to $t$.
    ELSE
        Return $\omega$
    END
END
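The rank-1 base case is the only part that differs in kind from Algorithm Closest Vector, so it may help to see it in isolation. The following is a hedged sketch: the function names are hypothetical, the affine subspace is assumed to be given as the solution set of a linear system M x = d (a representation chosen here for illustration), and `None` stands in for the special symbol returned when the lattice lies inside the subspace.

```python
from fractions import Fraction
from math import floor


def in_affine(point, rows, rhs):
    """True if point satisfies every equation row . x = d, i.e. point lies in A."""
    return all(sum(r * p for r, p in zip(row, point)) == d
               for row, d in zip(rows, rhs))


def rank1_generalized(b1, q1, rows, rhs):
    """Closest multiple of b1 to q1*b1 that lies outside A = {x : rows x = rhs}.

    Returns None (playing the role of the special symbol) when every multiple
    of b1 lies in A.
    """
    def multiple(c):
        return [c * x for x in b1]

    # An affine space containing both 0 and b1 contains the whole line through
    # them, hence the whole rank-1 lattice.
    if in_affine(multiple(0), rows, rhs) and in_affine(b1, rows, rhs):
        return None
    c = floor(q1 + Fraction(1, 2))            # closest integer to q1
    if not in_affine(multiple(c), rows, rhs):
        return multiple(c)
    # Otherwise take the closest integer different from c. The lattice and A
    # meet in at most one point here, so this multiple is outside A.
    c2 = c + 1 if q1 >= c else c - 1
    return multiple(c2)
```

With b1 = (2, 0), q1 = 9/5 and A the hyperplane x = 4, the nearest multiple (4, 0) is excluded and the sketch falls back to (2, 0).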
Next we show the correctness of the algorithm and analyze its running time.

Lemma 10. Assume the input to Algorithm Generalized Closest Vector is a
dual HKZ-basis $[b_1, \ldots, b_n]$ of the lattice $L$, a target vector $t$ represented as
$t = \sum_{j=1}^{n} q_j b_j$, $q_j \in \mathbb{Q}$, and an affine subspace $A \subseteq \mathbb{R}^m$. Then Algorithm
Generalized Closest Vector computes a closest vector in $L \setminus A$ to $t$.

Proof. We only show that for $n = 1$ the algorithm correctly computes a closest
vector in $L \setminus A$. The remainder of the proof is almost identical to the proof of
Lemma 4.

By Lemma 7, if $L \not\subseteq A$ then the dimension of span($L$) $\cap A$ is either 0 or
the intersection span($L$) $\cap A$ is empty. Therefore, $L$ and $A$ either have empty
intersection or they intersect in a single point. From this the correctness of
Algorithm Generalized Closest Vector for lattices with rank 1 follows. □



Lemma 11. Assume the lattice $L \subseteq \mathbb{R}^m$ and the target vector $t$ have representation
size $s$ with respect to the dual HKZ-basis $[b_1, \ldots, b_n]$. Also assume that the
representation size of $A$ is $s$. Then a vector in $L \setminus A$ closest to $t$ can be computed
with $3^n \cdot n! \cdot m^{O(1)}$ arithmetic operations on integers of size $s^{O(1)}$.

Proof. The result follows exactly as Lemma 5 with the recurrence
$T(n, m) = n \cdot T(n - 1, m) + m^{O(1)}$ replaced by the recurrence
$T(n, m) = 3 \cdot n \cdot T(n - 1, m) + m^{O(1)}$.

□



Finally we obtain 

Theorem 4. Given a lattice $L$ with rank $n$, an affine space $A \subseteq \mathbb{R}^m$, and a
vector $t \in$ span($L$), all with representation size $s$. A closest vector to $t$ in $L \setminus A$
can be computed with + arithmetic operations on integers 

of size . 

The proof of this theorem is the same as the proof for Theorem 3. As mentioned
earlier, Theorem 4 implies

Theorem 5. Given a lattice $L$ with rank $n$ and representation size $s$. Linearly
independent vectors $v_1, \ldots, v_n \in L$ with $\lambda_i(L) = \|v_i\|$ can be computed with
$3^n \cdot n! \cdot s^{O(1)}$ arithmetic operations on integers of size $s^{O(1)}$.
Abstract. Whenever we have data represented by constraints (such as
order, linear, polynomial, etc.), running time for many constraint pro- 
cessing algorithms can be considerably lowered if it is known that certain 
variables in those constraints are independent of each other. For example, 
when one deals with spatial and temporal databases given by constraints, 
the projection operation, which corresponds to quantifier elimination, is 
usually the costliest. Since the behavior of many quantifier elimination 
algorithms becomes worse as the dimension increases, eliminating certain 
variables from consideration helps speed up those algorithms. 

While these observations have been made in the literature, it re- 
mained unknown when the problem of testing if certain variables are 
independent is decidable, and how to construct efficiently a new rep- 
resentation of a constraint-set in which those variables do not appear 
together in the same atomic constraints. Here we answer this question. 
We first consider a general condition that gives us decidability of vari- 
able independence; this condition is stated in terms of model-theoretic 
properties of the structures corresponding to constraint classes. We then 
show that this condition covers the domains most relevant to spatial 
and temporal applications. For some of these domains, including linear 
and polynomial constraints over the reals, we provide a uniform decision 
procedure which gives us tractability, and present a polynomial-time al- 
gorithm for producing nice constraint representations. 



1 Introduction 

We start with a simple example. Suppose we have a set $S \subseteq \mathbb{R}^2$ given by simple
order-constraints $\varphi(x, y) = (0 < x < 1) \wedge (0 < y < 1)$. Suppose we want to
find its projection on the $x$ axis. This means writing the formula $\exists y\, \varphi(x, y)$ as
a quantifier-free formula. This can be done, in general, because the theory of
$\langle \mathbb{R}, <, (r)_{r \in \mathbb{R}} \rangle$ admits quantifier elimination. But in this particular case it is very
easy to find a quantifier-free formula equivalent to $\exists y\, \varphi(x, y)$ using just standard
rules for equivalence of first-order formulae:

$$\exists y\, \varphi(x, y) \leftrightarrow (0 < x < 1) \wedge \exists y\, (0 < y < 1) \leftrightarrow (0 < x < 1) \wedge \mathit{true} \leftrightarrow 0 < x < 1.$$

* Part of this work was done while visiting INRIA. 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 260-271, 2000.
Now notice that $\varphi$ can be considered as a formula in the language of the real
field $\langle \mathbb{R}, +, \cdot, 0, 1, < \rangle$ whose theory also admits quantifier elimination. Suppose
then that instead of $\varphi$, we are given an equivalent formula $\psi(x, y)$:

$$\bigl((0 < x < 1) \wedge (0 < y < 1) \wedge (4x^2 - y - 1 > 0)\bigr) \vee \bigl((0 < x < 1) \wedge (0 < y < 1) \wedge (4x^2 - y - 1 \le 0)\bigr).$$

The first step of quantifier elimination for $\exists y\, \psi$ is easy, as we propagate $\exists y$
inside the disjunction. However, trying to find a quantifier-free equivalent for
the first disjunct, that is, a formula equivalent to
$\exists y\, ((0 < x < 1) \wedge (0 < y < 1) \wedge (4x^2 - y - 1 > 0))$, one immediately encounters
obstacles. Unlike the earlier example, this one requires a bit of thought to come
up with the answer $(0.5 < x < 1)$. Similarly, some work is needed to compute the
answer $(0 < x < 1/\sqrt{2})$ for the second disjunct.

Why is it that the first quantifier-elimination procedure is completely elementary,
and the second is not, even though both $\varphi$ and $\psi$ define the same set?
The reason is that in the first representation of $S$, variables $x$ and $y$ are
independent, that is, they do not appear in the same atomic formulae. This makes
quantifier elimination easy. In the second case, $x$ and $y$ do appear together in
the same term $4x^2 - y - 1$, and this is what causes the problem.
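The two claims in this example (that both formulae define the same set, and that the first disjunct projects to 0.5 < x < 1) can be sanity-checked numerically. The sketch below only samples a rational grid, so it illustrates the claims rather than proving them; the function names and the grid are choices made here, and the strict inequalities follow the formulae as printed above.

```python
from fractions import Fraction


def phi(x, y):
    """The representation in which x and y are independent."""
    return 0 < x < 1 and 0 < y < 1


def psi(x, y):
    """The equivalent representation that mixes x and y in one term."""
    t = 4 * x * x - y - 1
    return ((0 < x < 1 and 0 < y < 1 and t > 0)
            or (0 < x < 1 and 0 < y < 1 and t <= 0))


# Sample points -1/5, -3/20, ..., 6/5 (exact rationals, so no rounding issues).
GRID = [Fraction(i, 20) for i in range(-4, 25)]


def same_set():
    """Do phi and psi agree on every sampled point?"""
    return all(phi(x, y) == psi(x, y) for x in GRID for y in GRID)


def first_disjunct_projection():
    """Sampled x for which some sampled y satisfies the first disjunct of psi."""
    return [x for x in GRID
            if any(0 < x < 1 and 0 < y < 1 and 4 * x * x - y - 1 > 0
                   for y in GRID)]
```

On this grid `first_disjunct_projection()` returns exactly the sampled points strictly between 1/2 and 1.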

This extremely simple observation can often make constraint processing eas- 
ier. While it can conceivably be useful in various tasks such as more efficient 
variable elimination in constraint logic programming 1 2j , here we concentrate 
on one application area, namely constraint databases j1 511 4| where it found its 
way into a practical system for querying spatio-temporal databases |Sj. The main 
goal of constraint databases is to model infinite database objects, which arise in 
a variety of applications, for example, in Geographical Information Systems. 

A particular constraint model is defined over a structure $\mathcal{M} = \langle U, \Omega \rangle$ (where
$U$ is the universe and $\Omega$ is the vocabulary) which is typically required to have
quantifier elimination. Those considered most often in spatial applications are the
real field $\mathbf{R} = \langle \mathbb{R}, +, \cdot, 0, 1, < \rangle$ and the real ordered group
$\mathbf{R}_{\mathrm{lin}} = \langle \mathbb{R}, +, -, 0, 1, < \rangle$,
which give rise to polynomial and linear constraint databases, respectively. A
constraint relation of arity $n$ is simply a definable subset of $U^n$, that is, a set
of tuples $a \in U^n$ that satisfy a first-order formula. For the above structures,
constraint relations are semi-algebraic sets for $\mathbf{R}$, and semi-linear sets for $\mathbf{R}_{\mathrm{lin}}$
P]. A constraint database is a finite set of constraint relations. 

A standard constraint query language over $\mathcal{M}$ is FO + $\mathcal{M}$, that is, first-order
logic in the language of $\mathcal{M}$ and symbols for relations in a constraint database.
For example, if a database contains a single ternary symbol $S$, the query
$\varphi(x) = \exists u, v\, \forall y, z\, (S(x, y, z) \leftrightarrow z = u \cdot y + v)$ finds all $a$ such that the intersection of $S$
with the plane x = a is a line. Note that if S is a semi-algebraic set, then so is 

One of the standard database operations is projection. In the language of
constraint processing, it corresponds to quantifier elimination. That is, given a
quantifier-free formula $\varphi(y, x_1, \ldots, x_{n-1})$, one wishes to find a quantifier-free
formula $\psi(x)$ equivalent to $\exists y\, \varphi(y, x)$. In many cases, the complexity of algorithms






L. Libkin 



to find such a $\psi$ is of the form $N^{f(n)}$, where $N$ is the size of the formula, and
$f$ is some function. For example, if one uses cylindrical algebraic decomposition
for the real field, $f$ is $O(2^n)$. In general, even if better algorithms are available,
the complexity of constraint processing often increases with dimension to
such an extent that it becomes unmanageable for large datasets (see, e.g., my 
Assume now that $x$ is split into two disjoint tuples $u$ and $v$ such that $(y, u)$
and $v$ are independent, that is, they do not appear in the same atomic formulae.
Then $\varphi$ is equivalent to a formula of the form

(1) $\bigvee_{i=1}^{k} \alpha_i(y, u) \wedge \beta_i(v)$.

Therefore, the formula $\exists y\, \varphi$ is equivalent to

(2) $\bigvee_{i=1}^{k} (\exists y\, \alpha_i(y, u)) \wedge \beta_i(v)$.

For a number of operations this is a significant improvement, as the exponent 
becomes lower. For example, in addition to quantifier elimination, data often 
has to be represented in a nice format (essentially, as union of cells |2]), and 
algorithms for doing this also benefit from reduction in the dimension [t)l 1 1 Ij . 
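The passage from form (1) to form (2) can be illustrated with a toy model. This is only a sketch under strong simplifying assumptions made here: each $\alpha_i$ is stood in for by a finite set of (y, u) pairs, each $\beta_i$ is an opaque label, and the function name `eliminate_y` is hypothetical. The point it shows is that the quantifier only ever touches the $\alpha_i$ part.

```python
def eliminate_y(disjuncts):
    """Map a list of (alpha_i, beta_i) pairs, as in form (1), to form (2).

    alpha_i is a finite set of (y, u) pairs; beta_i is passed through
    untouched, mirroring the fact that the quantifier on y never reaches it.
    """
    return [({u for (_y, u) in alpha}, beta) for alpha, beta in disjuncts]
```

For instance, a single disjunct with alpha = {(0, 'a'), (1, 'a'), (2, 'b')} and beta = 'beta1' is mapped to ({'a', 'b'}, 'beta1'): the work done depends only on the variables in (y, u), not on those in v.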

Even though such a notion of independence may seem to be too much of a 
restriction, from the practical point of view it is sometimes necessary to insist on 
it, as the cost of general quantifier elimination and other operations could be pro- 
hibitively expensive. For example, the Dedale constraint database system ^ 
requires that the projection operation only be applied when u consists of a single 
variable. Dealing with spatio-temporal applications, one often queries trajecto- 
ries of objects, or cadastral (land-ownership) information. These are typically 
represented as objects in $\mathbb{R}^3$ given by formulae $\varphi(x, y, t)$. To be able to compute
$\exists y\, \varphi(x, y, t)$, one approximates $\varphi$ by a formula $\psi(x, y, t)$ which is a Boolean
combination of formulae $\alpha_i(x, y)$ and $\beta_i(t)$. For trajectories, this amounts to saying
that an object is in a given region during a given interval of time; thus, it
is the information about the speed that is lost in order to have efficient query 
evaluation. As was further demonstrated in UDI, the difference between the case 
when at most 2 variables are dependent, and that of 3 or more variables being 
dependent, is quite dramatic, in the case of linear and polynomial constraints. 

What is missing, however, in this picture, is the ability to determine whether 
a given constraint representation of the data can be converted to the one in the 
right format, just as in our first example, $\psi(x, y)$ is equivalent to $\varphi(x, y)$, in which
variables x and y are independent. It was claimed in j5| that such a procedure 
exists for linear constraints, and then m gave a simpler algorithm. However, uni 
then showed that both claims were incorrect. It was thus not known if variable 
independence can be tested for relevant classes of constraints. 

Our main goal here is to show that variable independence can be tested for
many classes of constraints, and that algorithms for converting a given formula
into one in the right form can be obtained. Moreover, those algorithms often
work in time polynomial in the size of the formula (assuming the total number
of variables is fixed). Among structures for which we prove such results are the
real ordered group, the real field, as well as $\langle \mathbb{Z}, +, 0, 1, < \rangle$ extended with all the
relations $x = y \pmod{k}$, $k > 1$ (which is used in temporal applications). Even
if those algorithms are relatively expensive, it is worth putting data in a nice 
format for two reasons. First, such an algorithm works only once, and then the 
data is repeatedly queried by different queries, which can be evaluated faster. 
Secondly, some queries are known to preserve variable independence; hence, this 
information can be used for further processing the query output. 

Organization In Section 2 we define the notion of variable independence, and
more generally, the notion $\varphi \sim_{\mathcal{M}} P$ of a formula $\varphi$ respecting a certain partition
$P$ of its free variables. Then, in Section 3 we discuss requirements on the theory
of $\mathcal{M}$ that guarantee decidability of this notion, as well as the existence of an
algorithm that converts a given formula into one in the right shape. In Section
4 we discuss specific classes of structures and derive some complexity bounds.
In particular, we look at o-minimal structures m (which include linear and 
polynomial constraints over the reals) and give a uniform decision procedure. 
This procedure gives us tractability, and we also show how to find an equivalent 
formula in the right shape in polynomial time. All proofs are only sketched here; 
complete proofs are in the full version ira 



2 Notations 

All the definitions can be stated for arbitrary first-order structures, although for 
the algorithmic considerations we shall require at least decidability of the theory, 
and often quantifier elimination. 

Given a structure $\mathcal{M} = \langle U, \Omega \rangle$ (where $U$ is a set always assumed to be
infinite, and $\Omega$ can contain predicate, function, and constant symbols, and is
always assumed to be a recursive set), we say that the theory of $\mathcal{M}$ is decidable
if for every first-order sentence $\Phi$ in the language of $\mathcal{M}$ it is decidable if $\mathcal{M} \models \Phi$.
We say that $\mathcal{M}$ admits (effective) quantifier elimination if for every formula $\varphi(x)$
in the language of $\mathcal{M}$, there exists (and can be effectively found) a quantifier-free
formula $\psi(x)$ such that $\mathcal{M} \models \forall x\, (\varphi(x) \leftrightarrow \psi(x))$.

Given a formula $\varphi(x, y)$ in the language of $\mathcal{M}$, with $x$ of length $n$ and $y$ of
length $m$, and $a \in U^n$, we write $\varphi(a, \mathcal{M})$ for the set $\{b \in U^m \mid \mathcal{M} \models \varphi(a, b)\}$. In
the absence of variables $x$ we write $\varphi(\mathcal{M})$ for $\{b \mid \mathcal{M} \models \varphi(b)\}$. Sets of the form
$\varphi(\mathcal{M})$ are called definable. A function $f : U^n \to U^m$ is definable if its graph
$\{(a, b) \in U^{n+m} \mid b = f(a)\}$ is a definable set.

Given a tuple of variables $x = (x_1, \ldots, x_n)$ and a partition $P = \{B_1, \ldots, B_m\}$
on $\{1, \ldots, n\}$, we let $x_{B_i}$ stand for the subtuple of $x$ consisting of the $x_j$s with
$j \in B_i$. For a formula $\varphi(x_1, \ldots, x_n)$, we then say that $\varphi$ respects the partition $P$
(over $\mathcal{M}$) if $\varphi$ is equivalent to a Boolean combination of formulae each having
its free variables among $x_{B_i}$ for some $i \le m$. This will be written as $\varphi \sim_{\mathcal{M}} P$, or
just $\varphi \sim P$ if $\mathcal{M}$ is clear from the context.
In other words (by putting a Boolean combination into DNF), $\varphi \sim_{\mathcal{M}} P$ if
there exists a family of formulae $\alpha_j^i(x_{B_i})$, $i = 1, \ldots, m$, $j = 1, \ldots, k$, such that

(*) $\mathcal{M} \models \forall x\, \Bigl(\varphi(x) \leftrightarrow \bigvee_{j=1}^{k} \bigl(\alpha_j^1(x_{B_1}) \wedge \ldots \wedge \alpha_j^m(x_{B_m})\bigr)\Bigr)$.

When $\mathcal{M}$ has quantifier elimination, all $\alpha_j^i$s are quantifier free. In fact, under
the quantifier-elimination assumption, the definition of $\varphi \sim_{\mathcal{M}} P$ can be restated
as the equivalence of $\varphi$ to a quantifier-free formula $\psi$ such that every atomic
subformula of $\psi$ uses variables from only one block of $P$.

We say that in $\varphi$, two variables $x_i$ and $x_j$ are independent if there exists a
partition $P$ such that $\varphi \sim_{\mathcal{M}} P$ and $x_i$ and $x_j$ are in two different blocks of $P$.
Equivalently, $x_i$ and $x_j$ are independent if there exists a partition $P = (y, z)$ of
$x$ such that $\varphi \sim_{\mathcal{M}} P$, $x_i$ is in $y$ and $x_j$ is in $z$. (When convenient notationally,
we identify partitions on the indices of variables and variables themselves.)

Structures. After presenting a general decidability result, we shall deal with
several important classes of structures. Two of them were mentioned already:
the real ordered group $\mathbf{R}_{\mathrm{lin}} = \langle \mathbb{R}, +, -, 0, 1, < \rangle$ and the real field
$\mathbf{R} = \langle \mathbb{R}, +, \cdot, 0, 1, < \rangle$, corresponding to linear and polynomial constraints over
the reals. Some of the results for these structures extend to a larger class of
o-minimal structures: $\mathcal{M} = \langle U, \Omega \rangle$ is called o-minimal if one of the symbols
in $\Omega$ is $<$, interpreted as a linear order on $U$, and every definable subset of $U$,
$\{a \mid \mathcal{M} \models \varphi(a)\}$, is a finite union of points and open intervals. Both $\mathbf{R}_{\mathrm{lin}}$ and
$\mathbf{R}$ have quantifier elimination (by Fourier elimination [25] and Tarski's theorem,
respectively), which easily implies that they are o-minimal. The exponential
field $\langle \mathbb{R}, +, \cdot, e^x \rangle$ is an example of a structure which is o-minimal but
does not have quantifier elimination.

We shall deal with some structures on the integers. Of most interest to us is
$\mathbf{Z}_0 = \langle \mathbb{Z}, +, -, 0, 1, <, (\equiv_k)_{k>1} \rangle$ where $n \equiv_k m$ iff $n = m \pmod{k}$. This structure
corresponds to constraints given by linear repeating points, which are used for
modeling temporal databases. The structure $\mathbf{Z}_0$ admits effective quantifier
elimination, and its theory is decidable.

3 General Conditions for Deciding Variable Independence 

Given a structure $\mathcal{M}$, we consider two problems. The variable independence
problem $\mathrm{VI}_{\mathcal{M}}(\varphi, x_i, x_j)$ is to decide, for $\varphi(x_1, \ldots, x_n)$ in the language of $\mathcal{M}$,
if $x_i$ and $x_j$ are independent. The variable partition problem $\mathrm{VP}_{\mathcal{M}}(\varphi, P)$ is to
decide, for a given formula $\varphi(x_1, \ldots, x_n)$ and a partition $P$ on $\{1, \ldots, n\}$, if
$\varphi \sim_{\mathcal{M}} P$.

Note that the variable independence problem is a special case of the variable
partition problem, as to solve the former, one needs to solve the latter for some
partition $P = (B_1, B_2)$ with $i \in B_1$ and $j \in B_2$.



The above problems are just decision problems, but if the theory of $\mathcal{M}$ is
decidable, and the answer to $\mathrm{VP}_{\mathcal{M}}(\varphi, P)$ is 'yes', one can effectively find a
representation in the form (*), simply by enumerating all the formulae $(\psi_i(x))_i$
which are Boolean combinations of formulae having free variables from at most
one block of $P$, and then checking if $\mathcal{M} \models \forall x\, (\varphi(x) \leftrightarrow \psi_i(x))$. Since $\varphi \sim_{\mathcal{M}} P$,
for some finite $i$, we get a positive answer. In many interesting cases, we shall
see better algorithms for finding representation (*) than simple enumeration.

The first easy result shows that the problems $\mathrm{VI}_{\mathcal{M}}(\varphi, x_i, x_j)$ and $\mathrm{VP}_{\mathcal{M}}(\varphi, P)$
are equivalent; this allows us to deal then only with two-block partitions.

Lemma 1. For any $\mathcal{M}$, the variable independence problem is decidable over $\mathcal{M}$
iff the variable partition problem is decidable over $\mathcal{M}$. □

Next, we discuss conditions for decidability of the variable independence 
problem. It is clear that one needs decidability of the theory of A4. However, 
decidability alone (and even effective quantifier elimination) are not sufficient. 

Proposition 1. a) If the theory of $\mathcal{M}$ is undecidable, then the variable
independence problem is undecidable over $\mathcal{M}$.

b) There exists a structure $\mathcal{M}$ with a decidable theory and effective quantifier
elimination such that the variable independence problem is undecidable over
$\mathcal{M}$.

Proof sketch. a) If $\Phi$ is a sentence and $\varphi(x, y)$ is $(x = y) \wedge \neg\Phi$, then $x$ and $y$ are
independent in $\varphi$ iff $\mathcal{M} \models \Phi$.

b) An example is provided by the theory of traces. Let $U$ be a union
of three disjoint sets: descriptions of Turing machines, input words, and traces,
or partial computations of machines on input words, all appropriately coded as
strings. Let $\Omega$ contain a constant symbol for every element of $U$, and a single
ternary predicate $P(m, w, t)$ saying that $t$ is a trace of the machine $m$ on the
input word $w$. This signature can be expanded by finitely many symbols so that
the expanded model has effective quantifier elimination.

Now fix a Turing machine $m_0$ and an input word $w_0$ and consider the formula
$\varphi(t, t') = (P(m_0, w_0, t) \wedge (t = t'))$. We then show that $t$ and $t'$ are independent
iff $m_0$ halts on $w_0$. □

The proof of Proposition 1 b) shows that it is essential to be able to decide
finiteness in order to decide $\mathrm{VI}(\varphi, x_i, x_j)$ (as it is the finiteness of the number of
traces that turns out to be equivalent to variable independence). Recall that a
formula $\varphi(x)$ is algebraic if $\varphi(\mathcal{M})$ is finite. We say that there is an effective test
for algebraicity in $\mathcal{M}$ if for every $\varphi(x)$ in the language of $\mathcal{M}$, it is decidable if
$\varphi$ is algebraic. Note that this somewhat technical notion will trivially hold for
most relevant classes of constraints.

While the notion of variable independence is needed in the context of con- 
straint databases, for finite relational structures it is assumed to be meaningless 
as every tuple is represented as a conjunction of constraints of the form $x_i = c_i$,
where the $c_i$s are constants. For example, the graph $\{(1, 2), (3, 4)\}$ is given by the
formula $((x = 1) \wedge (y = 2)) \vee ((x = 3) \wedge (y = 4))$. Clearly, variables $x$ and $y$ are
independent.

However, over arbitrary structures, not every finite definable set would satisfy
the variable independence condition. To see this, let $\mathcal{M} = \langle \mathbb{N}, C, E \rangle$, where
$C$ is a unary relation interpreted as $\{1, 2\}$ and $E$ is a binary relation symbol
interpreted as $\{(1, 2), (2, 1)\}$. A routine argument shows that this $\mathcal{M}$ has quantifier
elimination, decidable theory, and there is a test for algebraicity. The formula
$\varphi(x, y) = E(x, y)$ then defines a finite set, but variables $x$ and $y$ are not
independent: this is because the only definable proper subsets of $\mathbb{N}$ are $\{1, 2\}$ and
$\mathbb{N} - \{1, 2\}$, and no Boolean combination of those gives us $E$. As another example,
consider the field of complex numbers, whose theory is decidable and has
quantifier elimination. Let $\varphi(x, y) = (x^2 + 1 = 0) \wedge (y^2 + 1 = 0) \wedge (x + y = 0)$.
It defines the finite set $\{(i, -i), (-i, i)\}$, but nevertheless $x$ and $y$ are not
independent (since $i$ is not definable).

To avoid similar situations, we impose an extra condition on a structure,
again, well known in model theory. We say that $\mathcal{M}$ has definable Skolem
functions if for every formula $\varphi(x, y)$ there exists a definable function $f_\varphi(x)$ with
the property that $\mathcal{M} \models \forall x\, (\exists y\, \varphi(x, y) \to \varphi(x, f_\varphi(x)))$. In other words, $f_\varphi(a)$ is
an element of $\varphi(a, \mathcal{M})$, assuming $\varphi(a, \mathcal{M})$ is not empty. We say that a Skolem
function $f_\varphi$ is invariant if $\varphi(a_1, \mathcal{M}) = \varphi(a_2, \mathcal{M})$ implies $f_\varphi(a_1) = f_\varphi(a_2)$.
If the existence of such a Skolem function can be guaranteed for every $\varphi$, we say
that $\mathcal{M}$ has definable invariant Skolem functions.

Theorem 1. Assume that $\mathcal{M}$ has the following properties:

(a) its theory is decidable;

(b) $\mathcal{M}$ has effective test for algebraicity; and

(c) $\mathcal{M}$ has definable invariant Skolem functions.

Then the variable partition and independence problems are decidable over $\mathcal{M}$.

Proof sketch. We consider the case of two block partitions; that is, deciding if a
formula $\varphi(x, y)$ respects the partition $P$ with blocks $x$ and $y$. Let $x$ have length
$n$ and $y$ have length $m$. Define an equivalence relation on $U^n$ by

$$a_1 \equiv a_2 \quad \text{iff} \quad \varphi(a_1, \mathcal{M}) = \varphi(a_2, \mathcal{M}).$$

Lemma 2. For $\varphi$, $P$ and $\equiv$ as above, $\varphi \sim_{\mathcal{M}} P$ iff $\equiv$ has finitely many equivalence
classes. □

Using this and the assumptions on $\mathcal{M}$, we show how to define a formula $\chi(x)$
finding a set of representatives of the equivalence classes of $\equiv$; then again using
the assumptions on $\mathcal{M}$ we show that it is decidable if $\chi(\mathcal{M})$ is finite. □
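For a finite binary relation, the equivalence relation used in the proof sketch can be computed directly, which makes the shape of the resulting representation concrete. The sketch below is an illustration only: it treats every finite set of values as definable (as with constraints of the form x = c over finite relational data) and therefore ignores the definability issues that the theorem's hypotheses address; the names `decompose` and `rebuild` are hypothetical.

```python
from collections import defaultdict


def decompose(relation):
    """Split a finite binary relation into 'rectangles', one per equivalence class.

    Two x-values are equivalent when they have the same fibre
    {y : (x, y) in relation}. Each class with a non-empty fibre yields one
    disjunct (set of x-values, common fibre).
    """
    fibre = defaultdict(set)
    for x, y in relation:
        fibre[x].add(y)
    classes = defaultdict(set)
    for x, ys in fibre.items():
        classes[frozenset(ys)].add(x)
    return [(xs, set(ys)) for ys, xs in classes.items()]


def rebuild(disjuncts):
    """Union of the rectangles xs x ys; recovers the original relation."""
    return {(x, y) for xs, ys in disjuncts for x in xs for y in ys}
```

For the relation {(1, 2), (3, 4), (5, 2)} this produces the two rectangles {1, 5} x {2} and {3} x {4}; the number of rectangles equals the number of equivalence classes with non-empty fibre.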

The proof of Theorem 1 gives an explicit construction for a formula witnessing
$\varphi \sim_{\mathcal{M}} P$, where $P$ has two blocks. We now extend it to arbitrary partitions.

Let $\varphi(x_1, \ldots, x_n)$ be given, and let $B \subseteq \{1, \ldots, n\}$. Let $\mathrm{card}(B) = k$. For
$a \in U^k$, by $\varphi_B(a, \mathcal{M})$ we denote the set of $b \in U^{n-k}$ such that $\varphi(c)$ holds, where
$c$ is obtained from $a$ and $b$ by putting their elements in the appropriate position,
$a$ being in the positions specified by $B$. For example, if $n = 4$, $B = \{2, 4\}$, and
$a = (a_1, a_2)$, $b = (b_1, b_2)$, then $c$ is $(b_1, a_1, b_2, a_2)$. Formally, for $i \in [1, n]$, let $k_1$
be the number of $j \in B$ with $j \le i$, and $k_2$ be the number of $j \notin B$ with $j \le i$.
Then $c_i$ is $a_{k_1}$ if $i \in B$, and $b_{k_2}$ if $i \notin B$.
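The construction of c from a and b is a plain interleaving, which the following short sketch makes concrete. The function name `interleave` and the use of 1-based positions in a Python set are choices made for this illustration.

```python
def interleave(a, b, block, n):
    """Build the n-tuple c: entries of a fill the positions in `block`
    (1-based), entries of b fill the remaining positions, both in order."""
    a_iter, b_iter = iter(a), iter(b)
    return tuple(next(a_iter) if i in block else next(b_iter)
                 for i in range(1, n + 1))
```

With n = 4, block {2, 4}, a = ("a1", "a2") and b = ("b1", "b2") this yields ("b1", "a1", "b2", "a2"), matching the example in the text.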

We use the notation

$$a_1 \equiv_{B_i} a_2 \quad \text{iff} \quad \varphi_{B_i}(a_1, \mathcal{M}) = \varphi_{B_i}(a_2, \mathcal{M}).$$

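We now obtain the following characterization of $\mathrm{VP}_{\mathcal{M}}(\varphi, P)$.

Corollary 1. Let $\mathcal{M}$ be as in Theorem 1, and let $\varphi(x_1, \ldots, x_n)$ and a partition
$P = (B_1, \ldots, B_m)$ on $\{1, \ldots, n\}$ be given. Then:

1. For each $i \le m$, it is decidable if the equivalence relation $\equiv_{B_i}$ has finitely
many equivalence classes. Furthermore, $\varphi \sim_{\mathcal{M}} P$ iff each $\equiv_{B_i}$ has finitely
many classes.

2. If $\varphi \sim_{\mathcal{M}} P$, then one can further effectively find integers $N_1, \ldots, N_m > 0$
and formulae $\alpha_j^i(x_{B_i})$, $i = 1, \ldots, m$, $j = 1, \ldots, N_i$, such that $\equiv_{B_i}$ has $N_i$
equivalence classes, which are definable by the formulae $\alpha_j^i(x_{B_i})$, $j \le N_i$.
Furthermore,

$$\mathcal{M} \models \forall x\, \Bigl(\varphi(x) \leftrightarrow \bigvee_{(i_1, \ldots, i_m) \in K} \alpha_{i_1}^1(x_{B_1}) \wedge \ldots \wedge \alpha_{i_m}^m(x_{B_m})\Bigr)$$

where

$$K = \{(i_1, \ldots, i_m) \mid \mathcal{M} \models \exists x\, (\alpha_{i_1}^1(x_{B_1}) \wedge \ldots \wedge \alpha_{i_m}^m(x_{B_m}) \wedge \varphi(x))\}.$$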

4 Decidability for Specific Classes of Constraints 

The general decidability result can be applied to a variety of structures, most
notably, those that we listed earlier as the ones particularly relevant to constraint
database applications (especially to spatial and temporal databases). In fact, the
problem will be shown to be decidable for linear constraints over the rationals and
the reals (this corresponds to structures $\langle \mathbb{Q}, +, -, 0, 1, < \rangle$ and $\mathbf{R}_{\mathrm{lin}}$), polynomial
constraints over the reals ($\mathbf{R}$), and linear repeating points ($\mathbf{Z}_0$).

4.1 Constraints on the Integers 

Here the result follows easily from Theorem 1.

Proposition 2. Let $\mathcal{M}$ be $(\mathbb{N}, <, \ldots)$ or $(\mathbb{Z}, <, \ldots)$, and let its theory be decidable.
Assume, in the latter case, that there is at least one definable constant in
$\mathcal{M}$. Then the variable partition and independence problems are decidable over
$\mathcal{M}$. □



Corollary 2. The variable partition problem is decidable over
$\mathbf{Z}_0 = (\mathbb{Z}, +, -, 0, 1, <, (\equiv_k)_{k>1})$. □

4.2 Linear and Polynomial Constraints over the Reals

The linear constraints over the reals (corresponding to the structure $\mathbf{R}_{\mathrm{lin}} =
(\mathbb{R}, +, -, 0, 1, <)$) and the polynomial constraints over the reals (corresponding
to $\mathbf{R} = (\mathbb{R}, +, \cdot, 0, 1, <)$) are the most useful constraints for spatial and
spatio-temporal applications, where the problem of variable independence originated,
and where variable independence is used in system prototypes. We thus
concentrate on these constraints.

In many cases, however, we can state the results in greater generality using 
the concept of o-minimality (cf. section Ej) . 

It is known that every o-minimal expansion of $\mathbf{R}_{\mathrm{lin}}$ has definable invariant
Skolem functions UBEnj. Since every definable subset of $U$ is a finite union of
points and open intervals, one can test algebraicity, assuming that the order is
dense: given $\varphi(x)$, the sentence $\exists u \exists v\, (u < v \wedge \forall x\, (u < x < v \rightarrow \varphi(x)))$ tests if
$\varphi(\mathcal{M})$ is infinite. This shows

Corollary 3. Let $\mathcal{M} = (\mathbb{R}, +, -, 0, 1, <, \ldots)$ be o-minimal, and have a decidable
theory. Then the variable partition and independence problems are decidable over
$\mathcal{M}$. In particular, these problems are decidable over $\mathbf{R}_{\mathrm{lin}}$ and $\mathbf{R}$. □

Since $(\mathbb{Q}, +, -, 0, 1, <)$ is elementarily equivalent to $\mathbf{R}_{\mathrm{lin}}$, we conclude that
the variable partition problem is decidable over it, too.



Uniform Decidability and Complexity Bounds. Our next goal is to present
a uniform procedure for solving the problem $\mathrm{VI}_{\mathcal{M}}(\varphi, P)$. More precisely, we say
that the variable independence problem is uniformly decidable over $\mathcal{M}$ if the
theory of $\mathcal{M}$ is decidable, and for every partition $P$ on $\{1, \ldots, n\}$, there exists a
single sentence $\sigma_P$ in the language of $\mathcal{M}$ expanded with an $n$-ary relation symbol
$S$ such that for any formula $\varphi(x_1, \ldots, x_n)$,

$\varphi \sim_{\mathcal{M}} P$ iff $(\mathcal{M}, \varphi(\mathcal{M})) \models \sigma_P$.

Here $(\mathcal{M}, \varphi(\mathcal{M}))$ is the expansion of $\mathcal{M}$ where the new symbol $S$ is interpreted
as $\{a \mid \mathcal{M} \models \varphi(a)\}$. Note that the decidability of the theory of $\mathcal{M}$ implies that
$(\mathcal{M}, \varphi(\mathcal{M})) \models \sigma_P$ is decidable.

Proposition 3. Let $\mathcal{M} = (\mathbb{R}, +, -, 0, 1, <, \ldots)$ be o-minimal and have a decidable
theory. Then the variable independence and partition problems are
uniformly decidable over $\mathcal{M}$.

Proof sketch. We show explicitly how to construct an invariant Skolem function for
a given equivalence relation. Given a (definable) set $Y$ of representatives of a
definable equivalence relation, its finiteness is tested as follows: Let $X$ be the
set of all coordinates of elements of $Y$. Since this is a definable set, it is finite
iff it does not contain an open interval (by o-minimality). This condition can be
tested by a sentence in the language of $\mathcal{M}$. □

Proposition 3 implies that the variable independence problem is uniformly
decidable over $\mathbf{R}_{\mathrm{lin}}$ and $\mathbf{R}$. The main application of this result is in establishing
complexity bounds.

Since $\mathbf{R}$ admits quantifier elimination, every semi-algebraic set is given by a
Boolean combination of polynomial inequalities. Thus, a standard way to represent
a semi-algebraic set in $\mathbb{R}^n$ is by specifying a collection of polynomials
$p_1, \ldots, p_k \in \mathbb{Z}[x_1, \ldots, x_n]$, and defining a set $X$ as a Boolean combination of sets
of the form $\{a \mid p_i(a)\ \theta\ 0\}$, where $\theta$ is either $=$ or $>$. Here $\mathbb{Z}[x_1, \ldots, x_n]$, as usual,
is the set of all polynomials in $n$ variables with coefficients from $\mathbb{Z}$. One can use
coefficients from $\mathbb{Q}$ as well, but this would not affect the class of definable sets.

Thus, when we study the complexity of $\mathrm{VP}_{\mathbf{R}}(\varphi, P)$, we assume that $\varphi$ is given
as a Boolean combination of polynomial equalities and inequalities, with all
polynomials having integer coefficients. The size of the input formula is then defined
in a standard way, assuming that all integer coefficients are given in binary. All
the above applies to semi-linear sets (that is, sets definable over $\mathbf{R}_{\mathrm{lin}}$); we just
restrict our attention to polynomials of degree 1.

Corollary 4. Let $\mathcal{M}$ be $\mathbf{R}_{\mathrm{lin}}$ or $\mathbf{R}$. Let $P$ be a fixed partition on $\{1, \ldots, n\}$.
Then, for a semi-algebraic (semi-linear) set given by a Boolean combination
$\varphi(x)$ of polynomial inequalities (of degree 1), the problem $\mathrm{VI}_{\mathcal{M}}(\varphi, P)$ is solvable
in time polynomial in the size of $\varphi$.

Proof sketch. This follows from Proposition 3 and complexity bounds on
quantifier elimination [Tl^ . □

Another reason to consider the uniform decision procedure for variable
independence is that it gives us a test for variable independence under some
transformations. For example, a linear coordinate change in general would destroy variable
independence, although it has relatively little effect on the shapes of objects in $\mathbb{R}^n$.
Consider, for example, the following version of the variable independence problem,
$\mathrm{LVI}(X, x_i, x_j)$: Given a semi-algebraic set $X \subseteq \mathbb{R}^n$ (defined by a formula
over $\mathbf{R}$), is there a linear change of coordinates such that in the new coordinate
system, variables $x_i$ and $x_j$ are independent?

The general decision procedure of Theorem 1 does not give us a decision
procedure for LVI. However, using uniformity, we easily obtain:

Corollary 5. The problem $\mathrm{LVI}(X, x_i, x_j)$ is decidable. □

It turns out that not only can the decision part of $\mathrm{VI}_{\mathcal{M}}(\varphi, P)$ and $\mathrm{VP}_{\mathcal{M}}(\varphi, P)$
be solved in polynomial time for a fixed $P$ over $\mathbf{R}_{\mathrm{lin}}$ and $\mathbf{R}$, but there is also
a polynomial time algorithm for finding a formula equivalent to $\varphi$ that witnesses
$\varphi \sim_{\mathcal{M}} P$.

Theorem 2. 1. Given $n > 1$, and a partition $P = (B_1, \ldots, B_m)$ on $\{1, \ldots, n\}$,
there exists an algorithm that, for every semi-algebraic set given by a formula
$\varphi(x_1, \ldots, x_n)$ which is a Boolean combination of polynomial equalities
and inequalities, tests if $\varphi \sim_{\mathbf{R}} P$, and in the case of a positive answer,
computes quantifier-free formulae $\alpha^i_j(x_{B_i})$ such that each $\alpha^i_j(x_{B_i})$ is
a Boolean combination of polynomial (in)equalities (where polynomials depend
only on $x_{B_i}$ and all coefficients are integers), and $\varphi(x)$ is equivalent to
$\bigvee_j \bigwedge_i \alpha^i_j(x_{B_i})$. Moreover, the algorithm works in time polynomial in the size
of $\varphi$.

2. The same statement is true when one replaces semi-algebraic by semi-linear,
and all polynomials are of degree 1.

The proof combines Corollary 1, uniform decidability (Proposition 3), complexity
bounds for quantifier elimination [TEnj and, for 1), algorithms for polynomial
root isolation. □

In the full version, we also consider the most typical case of spatio-temporal
applications: sets in $\mathbb{R}^3$ given by formulae $\varphi(x, y, t)$, where $x, y$ describe the
spatial component and $t$ describes the temporal component. For this case, we
present an algorithm based on cylindrical algebraic decomposition Pj for faster
testing of variable independence.

5 Conclusion 

We looked at the problem of deciding, for a set represented by a collection of 
constraints, whether some variables in those constraints are independent of each 
other. Knowing this can considerably improve the running time of several con- 
straint processing algorithms, in particular, quantifier elimination. The problem 
originated in the field of spatio-temporal databases represented by constraints 
(linear or polynomial over the reals, for example); it was demonstrated that on 
large datasets, reasonable performance can only be achieved if variables comprise 
small independent groups. It had not been known, however, if such independence 
conditions are decidable. 

Here we showed that these conditions are decidable for a large class of con- 
straints, including those relevant to spatial and temporal applications. Moreover, 
for linear and polynomial constraints over the reals, we gave a uniform decision 
procedure that implies tractability, and we showed that a given constraint set 
can be converted into one in a nice shape in polynomial time, too. 

Abstract. Many natural combinatorial problems can be expressed as
constraint satisfaction problems. This class of problems is known to be 
NP-complete in general, but certain restrictions on the form of the con- 
straints can ensure tractability. In this paper we show that any restricted 
set of constraint types can be associated with a finite universal algebra. 
We explore how the computational complexity of a restricted constraint 
satisfaction problem is connected to properties of the corresponding al- 
gebra. Using these results we exhibit a common structural property of all 
known intractable constraint satisfaction problems. Finally, we classify 
all finite strictly simple surjective algebras with respect to tractability. 
The result is a dichotomy theorem which significantly generalises Schae- 
fer’s dichotomy for the Generalised Satisfiability problem. 



1 Introduction 



The constraint satisfaction problem provides a framework in which it is possible 
to express, in a natural way, a wide variety of combinatorial problems 
The aim in a constraint satisfaction problem is to find an assignment of values 
to a given set of variables, subject to constraints on the values which can be 
assigned simultaneously to certain specified subsets of the variables. 

The mathematical framework used to describe constraint satisfaction prob- 
lems has strong links with several other areas of computer science and mathe- 
matics. For example, links with relational database theory m and with some 
notions of group theory |3j and universal algebra 0 have been investigated. 
There is a survey of these results in HS|. 

The constraint satisfaction problem is known to be NP-complete in gen- 
eral m However, certain restrictions on the form of the constraints can ensure 
tractability j.'-itbl/IDimj . In |2j, it is proved that the time complexity of a prob- 
lem involving a restricted set of constraint relations is determined by certain 
algebraic invariance properties of the constraint relations: namely, by the clone 
of polymorphisms of these relations. This result transforms the search for new 
tractable constraint types to the search for clones leading to tractability. 

In this paper we extend this result by exploring further the fundamental
connection between constraint relations, clones of operations, and finite algebras. 
The motivation for this research is that by characterising sets of relations in 
terms of their algebraic properties we can use powerful algebraic tools to analyse 
the complexity of the corresponding problems. Furthermore, in many cases the 
algebraic characterisation is simpler than an explicit description. As an example 
of the power of the algebraic approach we show that Schaefer’s theorem ini 
concerning the complexity of the Generalised Satisfiability problem can 
be expressed very simply as a classification of two-element algebras with respect 
to tractability. 

The paper is organised as follows. In Section 2 we give basic definitions 
relating to the constraint satisfaction problem, clones of polymorphisms, and
finite algebras, and we define the notion of a tractable algebra. In Section 3 we 
prove that, for our purposes, it suffices to consider idempotent algebras, that 
is, algebras whose unary term operations are trivial. We also study how the 
tractability of a finite algebra relates to the tractability of its smaller derived 
algebras. In Section 4 we use the results obtained to classify all strictly simple 
surjective algebras. We show that such algebras give rise to constraint satisfaction 
problems which are either tractable or NP-complete. Thus we extend Schaefer’s 
dichotomy theorem to a much larger class of problems. 



2 Definitions 

2.1 Constraint Satisfaction Problems 

The central notion in the study of constraints and constraint satisfaction problems
is the notion of a relation. An $n$-ary relation on $A$ is a subset of the set $A^n$,
where $A^n$ denotes the set of all $n$-tuples of elements of $A$. The set of all finitary
relations on $A$ is denoted by $R_A$.

We now define the standard constraint satisfaction problem which has been 
widely studied jiliUll 1112114) . 

Definition 1. The constraint satisfaction problem (CSP) is the combinatorial
decision problem with

Instance: a triple $(V, A, C)$ where $V$ is a set of variables; $A$ is a domain of
values; and $C$ is a set of constraints, $\{C_1, \ldots, C_q\}$.

Each constraint $C_i \in C$ is a pair $(s_i, \rho_i)$, where $s_i$ is a tuple of variables of
length $m_i$, called the constraint scope, and $\rho_i$ is an $m_i$-ary relation on $A$,
called the constraint relation.

Question: does there exist a solution, i.e. a function $f$, from $V$ to $A$, such
that for each constraint $(s_i, \rho_i) \in C$, with $s_i = (x_{i_1}, \ldots, x_{i_{m_i}})$, the tuple
$(f(x_{i_1}), \ldots, f(x_{i_{m_i}}))$ belongs to $\rho_i$?
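To make the definition concrete, here is a minimal brute-force sketch of the decision problem. It is exponential in the number of variables and is meant only to illustrate the definition; the function name `solve_csp` and the encoding of constraints as `(scope, relation)` pairs of Python tuples and sets are assumptions made for this illustration.

```python
from itertools import product


def solve_csp(variables, domain, constraints):
    """Return a solution f (a dict from variables to values) or None.

    constraints is a list of (scope, relation) pairs: scope is a tuple of
    variables, relation is a set of tuples of the same length over domain.
    """
    for values in product(domain, repeat=len(variables)):
        f = dict(zip(variables, values))
        if all(tuple(f[x] for x in scope) in rel for scope, rel in constraints):
            return f
    return None


sat = solve_csp(["x", "y"], [0, 1], [(("x", "y"), {(0, 1)})])
unsat = solve_csp(["x"], [0, 1], [(("x",), {(0,)}), (("x",), {(1,)})])
```

The first instance has the single solution mapping `x` to 0 and `y` to 1; the second has none because the two unary constraints conflict.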
Example 1. An instance of the standard propositional Satisfiability problem 0
is specified by giving a formula in propositional logic (that is, a
conjunction of clauses), and asking whether there are values for the variables which
make the formula true.

Suppose that $\Phi = X_1 \wedge \cdots \wedge X_n$ is such a formula, where the $X_i$ are clauses.
The satisfiability question for $\Phi$ can be expressed as the instance $(V, \{0, 1\}, C)$
of CSP, where $V$ is the set of all variables appearing in the clauses $X_i$, and
$C = \{(s_1, \rho_1), \ldots, (s_n, \rho_n)\}$. The constraints $(s_k, \rho_k)$ are constructed as follows.

For every clause $X_k$, where $x^k_1, \ldots, x^k_{j_k}$ are the variables appearing in $X_k$,
let $s_k = (x^k_1, \ldots, x^k_{j_k})$ and $\rho_k = \{0, 1\}^{j_k} \setminus \{(a_1, \ldots, a_{j_k})\}$ where $a_i = 1$ if $x^k_i$ is
negated and $a_i = 0$ otherwise (i.e., $\rho_k$ consists of all $j_k$-tuples that make $X_k$
true; the excluded tuple is the unique assignment falsifying every literal of $X_k$).

It is easy to show that solutions of this CSP instance are exactly assignments
which make the formula $\Phi$ true. Hence, any instance of Satisfiability can be
expressed as an instance of CSP.
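The clause-to-constraint translation can be sketched as follows. The encoding of a clause as a list of `(variable, negated)` pairs and the name `clause_to_constraint` are assumptions for this illustration, and each variable is assumed to occur at most once per clause.

```python
from itertools import product


def clause_to_constraint(clause):
    """clause: list of (variable, negated) literals with distinct variables.

    Returns (scope, relation), where relation contains every 0/1 tuple
    except the single tuple that makes all literals of the clause false.
    """
    scope = tuple(var for var, _ in clause)
    # A positive literal is false at value 0, a negated literal at value 1.
    falsifying = tuple(1 if negated else 0 for _, negated in clause)
    relation = set(product((0, 1), repeat=len(clause))) - {falsifying}
    return scope, relation


# The clause (x OR NOT y) is falsified only by x = 0, y = 1.
scope, relation = clause_to_constraint([("x", False), ("y", True)])
```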



Example 2. An instance of Graph Unreachability consists of a graph $G =
(V, E)$ and a pair of vertices, $v, w$. The question is whether there is no path
in $G$ from $v$ to $w$. This can be expressed as the CSP instance $(V, \{0, 1\}, C)$ where

$C = \{(e, \{(0, 0), (1, 1)\}) \mid e \in E\} \cup \{((v), \{(0)\}), ((w), \{(1)\})\}$.
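A small sketch of this encoding, again by exhaustive search: every edge carries the equality relation, so both endpoints of an edge receive the same value, while the two distinguished vertices are pinned to 0 and 1. The function name `unreachable` and the representation of edges as pairs of vertices are illustrative assumptions.

```python
from itertools import product


def unreachable(vertices, edges, v, w):
    """True iff the CSP instance built from the graph has a solution,
    i.e. iff no path connects v and w."""
    equal = {(0, 0), (1, 1)}
    constraints = [(e, equal) for e in edges]
    constraints += [((v,), {(0,)}), ((w,), {(1,)})]
    for values in product((0, 1), repeat=len(vertices)):
        f = dict(zip(vertices, values))
        if all(tuple(f[x] for x in s) in r for s, r in constraints):
            return True
    return False


vertices = [1, 2, 3, 4]
edges = [(1, 2), (3, 4)]
```

With these two disjoint edges, `unreachable(vertices, edges, 1, 3)` holds, while `unreachable(vertices, edges, 1, 2)` does not.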

A number of other examples of well-known problems expressed as CSPs can be 
found in |S|. 

The general constraint satisfaction problem is known to be NP-complete (as
seen from Example 1). However, certain restrictions may affect the complexity of
CSP. One of the possible natural ways to restrict CSP is to limit the relations
which can appear in constraints.

Definition 2. Let $\Gamma$ be a subset of $R_A$. Denote by CSP($\Gamma$) the subclass of CSP
defined by the following property: any constraint relation in any instance must
belong to $\Gamma$.



Example 3. An instance of Not-All-Equal Satisfiability ^ consists of a
collection of ternary clauses. The question is whether there is an assignment of 
values to the variables such that the variables in each clause do not all receive 
the same value. 

This problem can be expressed as the problem CSP({N}) where N is the 
following ternary relation on {0, 1}: 

$N = \{(a, b, c) \mid \{a, b, c\} = \{0, 1\}\}$.



Example 4. An instance of Graph q-Colourability consists of a graph $G$.
The question is whether the vertices of $G$ can be labelled with $q$ colours so that
adjacent vertices are assigned different colours.

This problem can be expressed as the problem CSP($\{\neq_B\}$) where $B$ is a
$q$-element set (of colours) and $\neq_B$ is the disequality relation on $B$, defined by

$\neq_B = \{(a, b) \in B^2 \mid a \neq b\}$.

It is well known that this problem belongs to P if $q \le 2$, and is NP-complete
otherwise.

For $A = \{0, 1\}$, the time complexity of CSP($\Gamma$) was completely classified by
Schaefer H3. In particular, he proved that such a problem is either tractable
(i.e., solvable in polynomial time) or NP-complete. The classification problem
for larger domains is still open and seems to be very interesting and hard |2|.

Problem 1. Characterise all sets of relations $\Gamma$ such that CSP($\Gamma$) is tractable.



2.2 Operations and Clones 

By an $n$-ary operation on a set $A$ we mean any mapping $f : A^n \to A$. The set
of all finitary operations on $A$ is denoted by $O_A$. An operation $f$ which satisfies
an identity of the form $f(x_1, \ldots, x_n) = x_i$ is called a projection.

Definition 3 (U§1). A set $C \subseteq O_A$ is called a clone on $A$ if

— $C$ contains all projections; and

— $C$ is closed under composition.

Suppose $f(x_1, \ldots, x_n) \in O_A$ and let $M = (a_{ij})$ be an $m \times n$ matrix whose entries
are elements from $A$. Define

$$f(M) = \begin{pmatrix} f(a_{11}, \ldots, a_{1n}) \\ \vdots \\ f(a_{m1}, \ldots, a_{mn}) \end{pmatrix}$$



Definition 4 ([13, 18]). An $n$-ary operation $f \in O_A$ preserves an $m$-ary relation
$\rho \in R_A$ (or $f$ is a polymorphism of $\rho$, or $\rho$ is invariant under $f$) if for any
$m \times n$ matrix $M$ whose columns belong to $\rho$, the column $f(M)$ belongs to $\rho$ as
well.

For given $F \subseteq O_A$ and $\Gamma \subseteq R_A$, let

$\mathrm{Inv}(F) = \{\rho \in R_A \mid \rho \text{ is invariant under each operation from } F\}$;

$\mathrm{Pol}(\Gamma) = \{f \in O_A \mid f \text{ is a polymorphism of each relation from } \Gamma\}$.
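For finite relations the preservation condition can be checked directly. The following is a minimal sketch under the assumption that a relation is stored as a set of equal-length tuples and an operation as a Python function; the name `preserves` is hypothetical.

```python
from itertools import product


def preserves(f, n, rho):
    """True iff the n-ary operation f preserves the relation rho."""
    for columns in product(rho, repeat=n):   # n columns, each a tuple of rho
        # zip(*columns) yields the rows of the matrix; apply f to each row.
        image = tuple(f(*row) for row in zip(*columns))
        if image not in rho:
            return False
    return True


leq = {(0, 0), (0, 1), (1, 1)}                          # the order relation on {0, 1}
nae = {t for t in product((0, 1), repeat=3) if len(set(t)) == 2}  # not-all-equal
```

Here `min` is a polymorphism of `leq` but not of `nae`: applied to the columns (0, 0, 1) and (1, 1, 0) it yields (0, 0, 0). Negation, in contrast, preserves `nae`.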



Example 5. Let $A = \{0, 1, \ldots, p - 1\}$, for some prime $p$, and let $f$ be the ternary
operation on $A$ given by $f(x, y, z) = x - y + z \bmod p$. The set $\mathrm{Inv}(\{f\})$ consists
of all relations which are solutions to some set of linear equations modulo $p$.
Hence, CSP($\mathrm{Inv}(\{f\})$) is tractable.
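As a quick sanity check of one direction of this example, the sketch below verifies for $p = 5$ that the solution set of a single linear equation is closed under $f$ applied coordinate-wise, and that a relation which is not such a solution set need not be. The particular equation $x + 2y = 1$ and the variable names are arbitrary choices made for illustration.

```python
from itertools import product

p = 5


def f(x, y, z):
    return (x - y + z) % p


# Solutions of x + 2y = 1 (mod p), viewed as a binary relation.
solutions = {(x, y) for x, y in product(range(p), repeat=2) if (x + 2 * y) % p == 1}


def closed_under_f(rho):
    """True iff applying f coordinate-wise to any three tuples of rho stays in rho."""
    return all(
        tuple(f(*row) for row in zip(a, b, c)) in rho
        for a, b, c in product(rho, repeat=3)
    )


not_affine = {(0,), (1,)}   # not the solution set of linear equations mod 5
```

`closed_under_f(solutions)` holds, whereas `closed_under_f(not_affine)` fails because `f(0, 1, 0)` equals 4.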






A. A. Bulatov, A. A. Krokhin, and P. Jeavons 



Jeavons has proved a strong theorem connecting the complexity of CSP($\Gamma$) to
the clone of operations $\mathrm{Pol}(\Gamma)$.

Theorem 1 ([9]). Let $A$ be a finite set, and $\Gamma, \Gamma_0 \subseteq R_A$. If $\Gamma_0$ is finite and
$\mathrm{Pol}(\Gamma) \subseteq \mathrm{Pol}(\Gamma_0)$ then CSP($\Gamma_0$) is reducible in polynomial time to CSP($\Gamma$).

Informally speaking, Theorem 1 says that the complexity of CSP($\Gamma$) is determined
by the clone $\mathrm{Pol}(\Gamma)$. Hence it allows us to tackle problems concerning the
complexity of constraint satisfaction problems using algebraic techniques.

Example 6. When $A = \{0, 1\}$ all possible clones were fully described by Post finj.
Using Theorem 1 together with this description of the possible clones, it is
possible to classify the complexity of any set of Boolean relations 0. It turns out
that in this case there are just 6 maximal tractable sets of relations corresponding
to 6 distinct clones. For any set of relations $\Gamma$ which is not contained in one of
these 6 maximal sets, the corresponding problem CSP($\Gamma$) is NP-complete.

This algebraic approach therefore provides an alternative proof of Schaefer’s
well-known dichotomy result ini concerning the complexity of the Generalised
Satisfiability problem (for details see 0).

2.3 Algebras 

In this subsection we first give the basic definitions of an algebra and certain 
standard algebraic constructions (following m and CHI). We then introduce the 
notion of a “tractable algebra” , and show how this relates to the tractability of 
restricted constraint satisfaction problems. 

Definition 5. An algebra is an ordered pair $\mathcal{A} = (A, F)$ such that $A$ is a
nonempty set and $F$ is a family of finitary operations on $A$. The set $A$ is called
the universe of $\mathcal{A}$; the operations from $F$ are called basic. An algebra with a finite
universe is referred to as a finite algebra.

Definition 6. Let $\mathcal{A} = (A, F)$ be an algebra. The $m$-th direct power of $\mathcal{A}$ is the
algebra $\mathcal{A}^m = (A^m, F)$ where we treat each ($n$-ary) operation $f_i \in F$ as acting on $A^m$
component-wise, i.e.

$$f_i((a_{11}, \ldots, a_{1m}), \ldots, (a_{n1}, \ldots, a_{nm})) = (f_i(a_{11}, \ldots, a_{n1}), \ldots, f_i(a_{1m}, \ldots, a_{nm})).$$

Definition 7. Let $\mathcal{A} = (A, F)$ be an algebra and $B$ a subset of $A$ such that, for
any $f_i \in F$ (of arity $n$), and for any $b_1, \ldots, b_n \in B$, we have $f_i(b_1, \ldots, b_n) \in B$. Then the
algebra $\mathcal{B} = (B, F|_B)$, where $F|_B$ consists of the restrictions of the operations $f_i$
to $B$, is called a subalgebra of $\mathcal{A}$.

Definition 8. The finite power subuniverse system of an algebra $\mathcal{A}$ is the set
$SP_{\mathrm{fin}}(\mathcal{A})$, where

$SP_{\mathrm{fin}}(\mathcal{A}) = \{\rho \mid \rho \text{ is a universe of a subalgebra of } \mathcal{A}^m \text{ for some } m \ge 1\}$.

Remark 1. $SP_{\mathrm{fin}}(\mathcal{A}) = \mathrm{Inv}(F) \subseteq R_A$.

Definition 9. Let $\mathcal{A} = (A, F)$ be an algebra. The least clone on $A$ containing
$F$ is called the clone of term operations of $\mathcal{A}$ and is denoted by $\mathrm{Clo}(\mathcal{A})$. Two
algebras with the same universe are called term equivalent if they have the same
clone of term operations.

In other words, the clone $\mathrm{Clo}(\mathcal{A})$ consists of all operations that can be obtained
from $F$ and the projections by means of composition.

Theorem 2 ([18]). For a finite algebra $\mathcal{A}$, $SP_{\mathrm{fin}}(\mathcal{A})$ is the greatest set $\Gamma$ with
$\mathrm{Pol}(\Gamma) = \mathrm{Clo}(\mathcal{A})$.

Corollary 1. If finite algebras $\mathcal{A}_1$ and $\mathcal{A}_2$ are term equivalent then the equality
$SP_{\mathrm{fin}}(\mathcal{A}_1) = SP_{\mathrm{fin}}(\mathcal{A}_2)$ holds.

We are now in a position to link the complexity of a restricted constraint
satisfaction problem CSP($\Gamma$) to properties of a corresponding algebra. Given an
arbitrary set of relations $\Gamma \subseteq R_A$, consider the algebra $\mathcal{A}_\Gamma = (A, \mathrm{Pol}(\Gamma))$. By
Theorem 2, $\Gamma$ is contained in $SP_{\mathrm{fin}}(\mathcal{A}_\Gamma)$. By Theorem 1, if CSP($\Gamma$) is tractable,
then CSP($\Gamma_0$) is tractable for any finite subset $\Gamma_0$ of $SP_{\mathrm{fin}}(\mathcal{A}_\Gamma)$. Hence to
identify sets of relations for which CSP($\Gamma$) is tractable, it suffices to identify when
the corresponding algebra, $\mathcal{A}_\Gamma$, is tractable, in the following sense.

Definition 10. We call an algebra $\mathcal{A}$ (globally) tractable if CSP($SP_{\mathrm{fin}}(\mathcal{A})$) is
tractable, and locally tractable¹ if, for every finite $\Gamma_0 \subseteq SP_{\mathrm{fin}}(\mathcal{A})$, CSP($\Gamma_0$) is
tractable.

We call an algebra $\mathcal{A}$ NP-complete if CSP($SP_{\mathrm{fin}}(\mathcal{A})$) is NP-complete.

With this definition, our original problem of identifying all sets of relations for
which CSP($\Gamma$) is tractable has been reduced to the following:

Problem 2. Characterise all tractable finite algebras.

Schaefer’s result ^ (see Example 6) provides a first step towards answering
this problem, because it yields a complete classification of two-element algebras
with respect to tractability.

Theorem 3. A two-element algebra $\mathcal{A} = (\{0, 1\}, F)$ is NP-complete if it is term
equivalent to an algebra $(\{0, 1\}, G)$ where $G$ is a permutation group on $\{0, 1\}$.
Otherwise $\mathcal{A}$ is tractable.

The central idea in the remainder of this paper is to generalise this result by using
the algebraic properties of an arbitrary algebra, $\mathcal{A}$, to determine the complexity
of the associated constraint satisfaction problem, CSP($SP_{\mathrm{fin}}(\mathcal{A})$).

¹ We must distinguish the notions of global tractability and local tractability because
there is no guarantee that the following situation cannot happen: for any finite
subset $\Gamma_0$ of some infinite set, $\Gamma$, of relations, there exists a particular polynomial
algorithm solving CSP($\Gamma_0$), but there is no uniform polynomial algorithm for solving
all instances of CSP($\Gamma$).

3 Tractability and Algebraic Constructions

In this section we first show that when studying the tractability of finite algebras
we can restrict our attention to certain special classes of algebras.

Definition 11. We call an algebra surjective² if all of its term operations are
surjective.

It is easy to verify that a finite algebra is surjective if and only if its unary
term operations form a permutation group.

Definition 12 ([20]). Let $\mathcal{A} = (A, F)$ be an algebra, and $U$ a non-empty subset
of $A$. The induced term algebra $\mathcal{A}|_U$ is defined as $(U, (\mathrm{Clo}\,\mathcal{A})|_U)$ where
$(\mathrm{Clo}\,\mathcal{A})|_U = \{g|_U : g \in \mathrm{Clo}\,\mathcal{A} \text{ and } g \text{ preserves } U\}$.

It is easy to check that $\mathrm{Clo}(\mathcal{A}|_U) = (\mathrm{Clo}\,\mathcal{A})|_U$.

Theorem 4. For any finite algebra $\mathcal{A}$ there exists a set $U$ such that

i) $\mathcal{A}|_U$ is surjective;

ii) If $\mathcal{A}$ is locally tractable then so is $\mathcal{A}|_U$;

iii) If $\mathcal{A}|_U$ is (locally) tractable then so is $\mathcal{A}$.

Proof. Let $\mathcal{A} = (A, F)$ be a finite algebra, and let $U$ be a minimal subset of $A$
which is the range of some unary operation $f \in \mathrm{Clo}\,\mathcal{A}$.

Clearly, $\mathcal{A}|_U$ is surjective by the choice of $U$. We now prove that it satisfies
properties (ii) and (iii).

(ii) Let $\Gamma$ be a finite subset of $SP_{\mathrm{fin}}(\mathcal{A}|_U)$. We will show that CSP($\Gamma$) is
linear-time reducible to CSP($\Gamma'$) where $\Gamma'$ is a suitable finite subset of $SP_{\mathrm{fin}}(\mathcal{A})$.

Let $\mathcal{P} = (V, U, C)$ be an instance of CSP($\Gamma$). For any relation $\rho \in \Gamma$, with
arity $m$, we denote by $\rho'$ the universe of the subalgebra of $\mathcal{A}^m$ generated by $\rho$,
i.e.,

$\rho' = \{g(a_1, \ldots, a_k) \mid k > 0,\ g \in \mathrm{Clo}(\mathcal{A}) \text{ and } a_1, \ldots, a_k \in \rho\}$.

Let $\Gamma' = \{\rho' \mid \rho \in \Gamma\}$ and set $\mathcal{P}' = (V, A, C')$ where $C' = \{(s, \rho') \mid (s, \rho) \in C\}$.
Now $\mathcal{P}'$ is a problem instance of CSP($\Gamma'$) which can be obtained from $\mathcal{P}$ in
linear time. Since all the projections belong to $\mathrm{Clo}(\mathcal{A})$, we have $\rho \subseteq \rho'$, so any
solution of $\mathcal{P}$ is also a solution of $\mathcal{P}'$. Conversely, suppose $\psi$ is a solution of $\mathcal{P}'$.
We shall prove that $f\psi$ is a solution of $\mathcal{P}$. For this, we show that $f(\rho') \subseteq \rho$. Let
$a \in \rho'$. Then $f(a) = fg(a_1, \ldots, a_k)$ for some $g \in \mathrm{Clo}(\mathcal{A})$ and some $a_1, \ldots, a_k \in \rho$.
The operation $fg(x_1, \ldots, x_k)$ preserves $U$ and belongs to $\mathrm{Clo}(\mathcal{A})$; therefore its
restriction to $U$ belongs to $\mathrm{Clo}(\mathcal{A})|_U$. Since $\rho \in SP_{\mathrm{fin}}(\mathcal{A}|_U)$, we get $f(a) \in \rho$.

(iii) We shall show that CSP($SP_{\mathrm{fin}}(\mathcal{A})$) can be reduced in linear time to
CSP($SP_{\mathrm{fin}}(\mathcal{A}|_U)$). First note that, by the choice of $U$, $f|_U$ is a permutation,
so by iterating as necessary we obtain a unary operation $f' \in \mathrm{Clo}\,\mathcal{A}$ such that
$f'(a) = a$ for all $a$ in $U$.
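The iteration step can be sketched as follows, assuming the unary operation is given as a dictionary on a finite set and that its restriction to its own range is a permutation (which the minimality of the range guarantees in the proof). The name `fix_range_pointwise` is hypothetical.

```python
def fix_range_pointwise(f, A):
    """Return (g, U) where U is the range of f and g is a power of f
    that is the identity on U."""
    U = {f[a] for a in A}
    if {f[u] for u in U} != U:
        raise ValueError("f restricted to its range is not a permutation")
    g = dict(f)
    while any(g[u] != u for u in U):
        g = {a: f[g[a]] for a in A}   # compose with f once more
    return g, U


A = {0, 1, 2, 3}
f = {0: 1, 1: 2, 2: 1, 3: 2}          # range {1, 2}; f swaps 1 and 2
g, U = fix_range_pointwise(f, A)
```

In this example a single extra composition suffices: `g` is the square of `f`, it still has range `U`, and it fixes both elements of `U`.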



² Note that in m an algebra is said to be surjective if all of its basic operations are
surjective. Such algebras can have non-surjective term operations.

Let $\mathcal{P} = (V, A, C)$ be an instance of CSP($SP_{\mathrm{fin}}(\mathcal{A})$), where $C = \{(s_1, \rho_1), \ldots,
(s_q, \rho_q)\}$, and consider the triple $\mathcal{P}' = (V, U, C')$, where $C' = \{(s_1, f'(\rho_1)), \ldots,
(s_q, f'(\rho_q))\}$ and $f'(\rho_i) = \{f'(a) \mid a \in \rho_i\}$.

We first prove that $\mathcal{P}'$ is an instance of CSP($SP_{\mathrm{fin}}(\mathcal{A}|_U)$). Since the range
of $f'$ is $U$, any relation of the form $f'(\rho_i)$ can be viewed as a relation on $U$.
We need to show that, for $1 \le i \le q$, $f'(\rho_i)$ belongs to $SP_{\mathrm{fin}}(\mathcal{A}|_U)$, i.e. for
any $g(x_1, \ldots, x_m) \in \mathrm{Clo}(\mathcal{A}|_U)$, and for any $a_1, \ldots, a_m \in f'(\rho_i)$, we have
$a = g(a_1, \ldots, a_m) \in f'(\rho_i)$. The operation $g$ is the restriction of some $g' \in \mathrm{Clo}\,\mathcal{A}$
to $U$ such that $g'$ preserves $U$. Suppose that $a_j = f'(b_j)$ for some $b_j \in \rho_i$,
$1 \le j \le m$. Then

$a = g(a_1, \ldots, a_m) = g'(f'(b_1), \ldots, f'(b_m))$.

We have $a \in \rho_i$, because all elements $b_j$ belong to $\rho_i$, and the function
$g'(f'(x_1), \ldots, f'(x_m))$ belongs to $\mathrm{Clo}(\mathcal{A})$. All components of $a$ belong to $U$, since $g'$ preserves $U$.
Finally, we have $a = f'(a) \in f'(\rho_i)$, since $f'(u) = u$ for any $u \in U$.

Since each $\rho_i \in SP_{\mathrm{fin}}(\mathcal{A})$, we have $f'(\rho_i) \subseteq \rho_i$, therefore any solution
of $\mathcal{P}'$ is also a solution of $\mathcal{P}$. Furthermore, if $\psi$ is a solution of $\mathcal{P}$ then $f'\psi$ is
a solution of $\mathcal{P}'$. Clearly, $\mathcal{P}'$ can be obtained from $\mathcal{P}$ in linear time. Therefore
tractability of $\mathcal{A}|_U$ implies that $\mathcal{A}$ is tractable. Finally, local tractability of $\mathcal{A}|_U$
implies local tractability of $\mathcal{A}$, since if all the constraint relations in $C$ are taken
from some finite set $\Gamma \subseteq SP_{\mathrm{fin}}(\mathcal{A})$ then every constraint relation in $C'$ belongs
to a finite subset $\{f'(\rho) \mid \rho \in \Gamma\}$ of $SP_{\mathrm{fin}}(\mathcal{A}|_U)$. □

The next theorem shows that we need only consider surjective algebras which 
are idempotent. 

Definition 13. An operation $f$ on $A$ is called idempotent if $f(x, \ldots, x) = x$ for
any $x \in A$. The full idempotent reduct of an algebra $\mathcal{A} = (A, F)$ is the algebra
$(A, \mathrm{Clo}_{\mathrm{id}}(\mathcal{A}))$ where $\mathrm{Clo}_{\mathrm{id}}(\mathcal{A})$ consists of all idempotent operations from $\mathrm{Clo}(\mathcal{A})$.

Theorem 5. A finite surjective algebra $\mathcal{A}$ is locally tractable if and only if its
full idempotent reduct is locally tractable.

Proof. Omitted. See the full version of this paper [2].

The next two theorems connect the tractability of a finite algebra with the 
tractability of its subalgebras and homomorphic images. 

Definition 14. Let $\mathcal{A}_1 = (A_1, F_1)$ and $\mathcal{A}_2 = (A_2, F_2)$, where $F_1 = (f^1_i \mid i \in I)$
and $F_2 = (f^2_i \mid i \in I)$. A map $\varphi : A_1 \to A_2$ is called a homomorphism from
$\mathcal{A}_1$ to $\mathcal{A}_2$ if $\varphi f^1_i(a_1, \ldots, a_m) = f^2_i(\varphi(a_1), \ldots, \varphi(a_m))$ holds for all $i \in I$ and all
$a_1, \ldots, a_m \in A_1$. If the map $\varphi$ is onto then $\mathcal{A}_2$ is said to be a homomorphic
image of $\mathcal{A}_1$. A homomorphic image of a subalgebra of an algebra $\mathcal{A}$ is called a
factor of $\mathcal{A}$.

Theorem 6. Let $\mathcal{A}$ be a finite algebra. Then

(i) if $\mathcal{A}$ is (locally) tractable then so is every subalgebra of $\mathcal{A}$;

(ii) if $\mathcal{A}$ has an NP-complete subalgebra then $\mathcal{A}$ is NP-complete itself.

Proof. Let $\mathcal{A}'$ be a subalgebra of $\mathcal{A}$. It is easy to check that $SP_{\mathrm{fin}}(\mathcal{A}') \subseteq SP_{\mathrm{fin}}(\mathcal{A})$.
Hence, CSP($SP_{\mathrm{fin}}(\mathcal{A}')$) can be reduced to CSP($SP_{\mathrm{fin}}(\mathcal{A})$) in constant time.
Now (i) and (ii) follow immediately from this reduction. □

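Theorem 7. If a finite algebra $\mathcal{A}$ is locally tractable then so is every homomorphic
image of $\mathcal{A}$.

Proof. Let $\mathcal{B}$ be a homomorphic image of $\mathcal{A}$ and let $\varphi$ be the corresponding
homomorphism. We will show that for any finite $\Gamma \subseteq SP_{\mathrm{fin}}(\mathcal{B})$, CSP($\Gamma$) is
linear-time reducible to CSP($\Gamma'$) for some finite $\Gamma' \subseteq SP_{\mathrm{fin}}(\mathcal{A})$.

For $\rho \in SP_{\mathrm{fin}}(\mathcal{B})$, set $\varphi^{-1}(\rho) = \{a \mid \varphi(a) \in \rho\}$ where $\varphi$ acts component-wise.
It is clear that $\varphi^{-1}(\rho)$ is a relation of the same arity as $\rho$. It is straightforward
to check that $\varphi^{-1}(\rho) \in SP_{\mathrm{fin}}(\mathcal{A})$. Let $\Gamma' = \{\varphi^{-1}(\rho) \mid \rho \in \Gamma\}$. Then $\Gamma'$ is a finite
subset of $SP_{\mathrm{fin}}(\mathcal{A})$.

Take an instance $\mathcal{P} = (V, B, C)$ of CSP($\Gamma$) and construct the instance $\mathcal{P}' =
(V, A, C')$ of CSP($\Gamma'$) where $C' = \{(s, \varphi^{-1}(\rho)) \mid (s, \rho) \in C\}$.

If $\psi$ is a solution of $\mathcal{P}'$ then $\varphi\psi$ is a solution of $\mathcal{P}$. Conversely, if $\xi$ is a solution
of $\mathcal{P}$, then any function $\psi : V \to A$ such that $\varphi\psi(v) = \xi(v)$ for any $v \in V$ is a
solution of $\mathcal{P}'$. □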

Corollary 2. If $\mathcal{A}$ is a locally tractable finite algebra then so is every factor of
$\mathcal{A}$.

The following corollary from Theorems 6 and 7 seems to be quite significant,
since all known forms of NP-complete CSPs (see [I dll /j i can be obtained using
it. An algebra $(A, G)$ where $G$ is a permutation group on $A$ is said to be a G-set.
A G-set is said to be non-trivial if its universe is not a singleton.

Corollary 3. A finite algebra $\mathcal{A}$ is NP-complete if it has a factor which is term
equivalent to a non-trivial G-set.

Proof. Let $\mathcal{B}$ be such a factor. Denote by $\mathcal{A}'$ a subalgebra of $\mathcal{A}$ such that $\mathcal{B}$ is a
homomorphic image of $\mathcal{A}'$, and let $B$ be the universe of $\mathcal{B}$.

First, consider the case when $B$ contains only two elements, say 0 and 1. Then
the relation $N$ of Example 3 belongs to $SP_{\mathrm{fin}}(\mathcal{B}) = \mathrm{Inv}(G)$. It follows from the
proof of Theorem 7 that CSP($\{N\}$) is reducible in polynomial time to CSP($\Gamma'$)
for some finite $\Gamma' \subseteq SP_{\mathrm{fin}}(\mathcal{A}')$. Since Not-All-Equal Satisfiability, which
corresponds to CSP($\{N\}$), is NP-complete jlZj, we see that $\mathcal{A}'$ is also
NP-complete. Applying Theorem 6, we conclude that $\mathcal{A}$ is NP-complete.

Now consider the case when $|B| \ge 3$. Then the disequality relation $\neq_B$ of
Example 4 belongs to $SP_{\mathrm{fin}}(\mathcal{B})$. Proceeding as in the previous case one can show
that Graph $|B|$-Colourability (see Example 4), which is NP-complete |3j,
is reducible in polynomial time to CSP($SP_{\mathrm{fin}}(\mathcal{A}')$). Hence, by Theorem 6, $\mathcal{A}$ is
NP-complete in this case also. □

4 Strictly Simple Surjective Algebras

Using the results of the previous section, we now consider certain special algebras
all of whose smaller factors are tractable.

Definition 15. A finite algebra is called strictly simple if all of its smaller
factors contain one element.

All finite strictly simple surjective algebras have been described by Szendrei m.
Using this powerful algebraic result we are able to classify the complexity of all
such algebras.

Theorem 8. A finite strictly simple surjective algebra $\mathcal{A}$ is NP-complete if it
is term equivalent to a non-trivial G-set. Otherwise $\mathcal{A}$ is tractable.

Proof. Omitted. See the full version of the paper [2].



Example 7. Let $A = \{0, 1, \ldots, m\}$, choose $k \ge 2$, and let 
$\Gamma_k = \{R_0, R_k, \{(0)\}, \{(1)\}, \ldots, \{(m)\}\}$, where 

$$R_0 = \{(0,0), (1,2), (2,3), \ldots, (m-1,m), (m,1)\};$$

$$R_k = \{(a_1, \ldots, a_k) \in A^k \mid a_i = 0 \text{ for at least one } i,\ 1 \le i \le k\}.$$

Using Szendrei's classification, it can be shown that $\mathcal{A}_{\Gamma_k}$ is a strictly simple 
surjective algebra, for every k. Furthermore, $\mathcal{A}_{\Gamma_k}$ contains essentially non-unary 
operations, and so is not term-equivalent to a G-set. Hence, by Theorem 8, 
CSP($\Gamma_k$) is tractable. 

Theorem 8 is a dichotomy result: it shows that any finite strictly simple 
surjective algebra is either tractable or NP-complete. Note that Schaefer's result 
for the Generalised Satisfiability problem (Theorem 3) is a special case of 
this result. Whether there is such a dichotomy in general is possibly the most 
intriguing open question in this area. 

Problem 3. Is every finite algebra either tractable or NP-complete? 

The results obtained via the algebraic approach and, in particular, the results 
obtained in this paper, prompt us to make the following conjecture. 

Conjecture 1. A finite algebra is NP-complete if it has a factor which is term 
equivalent to a non-trivial G-set. Otherwise it is tractable. 

Abstract. An online algorithm for variable-sized bin packing, based on 
the Harmonic algorithm of Lee and Lee, is investigated. This algorithm 
is based on one proposed by Csirik. It is shown that the algorithm is 
optimal among those which use bounded space. An interesting feature of 
the analysis is that, although it is shown that our algorithm achieves a 
performance ratio arbitrarily close to the optimum value, it is not known 
precisely what that value is. The case where bins of capacity 1 and 
$\alpha \in (0, 1)$ are used is studied in greater detail. It is shown that 
among algorithms which are allowed to choose $\alpha$, the optimal performance 
ratio lies in [1.37530, 1.37532]. 

Track A Keywords: Bin packing. Online algorithms. Lower bounds. 



1 Introduction 

Bin packing is one of the oldest and most well studied problems in computer 
science. Ideas which originated in the study of the bin packing problem 
have helped shape computer science as we know it today. The idea of finding 
the best approximation algorithm for a problem has its origins in bin packing. 
The study of online algorithms also has its roots in the study of bin packing. In 
this paper, we investigate a natural generalization of the classical bin packing 
problem known as variable-sized bin packing. 

In the classical bin packing problem, we receive a sequence $\sigma$ of pieces 
$p_1, p_2, \ldots, p_N$. Each piece has a fixed size in (0, 1]. In a slight abuse of 
notation, we use $p_i$ to indicate both the ith piece and its size. The usage should be obvious 
from the context. We have an infinite number of bins each with capacity 1. Each 
piece must be assigned to a bin. Further, the sum of the sizes of the pieces 
assigned to any bin may not exceed its capacity. A bin is empty if no piece is 
assigned to it, otherwise it is used. The goal is to minimize the number of bins 
used. 

The variable-sized bin packing problem differs from the classical one in that 
bins do not all have the same capacity. There are an infinite number of bins of 
each capacity $\alpha_1 < \alpha_2 < \cdots < \alpha_m = 1$. The goal now is to minimize the sum of 
the capacities of the bins used. 

In the online versions of these problems, each piece must be assigned in 
turn, without knowledge of the next pieces. Since it is impossible in general to 
produce the best possible solution when computation occurs online, we consider 
approximation algorithms. Basically, we want to find an algorithm which incurs 
a cost which is within a constant factor of the minimum possible cost, no matter 
what the input is. This constant factor is known as the asymptotic performance 
ratio. 

A bin-packing algorithm uses bounded space if it has only a constant number 
of bins available to accept items at any point in time. These bins are called open 
bins. Bins which have already accepted some items, but which the algorithm no 
longer considers for packing are closed bins. The bounded space assumption is 
a quite natural one, especially so in online bin packing. Essentially the bounded 
space restriction guarantees that output of packed bins is steady, that the packer 
does not accumulate an enormous backlog of bins which are only output at the 
end of processing. In many applications this is a quite desirable feature in an 
algorithm. 

We define the asymptotic performance ratio more precisely. For a given input 
sequence $\sigma$, let $\mathrm{cost}_A(\sigma)$ be the sum of the capacities of bins used by algorithm 
A on $\sigma$. Let $\mathrm{cost}(\sigma)$ be the minimum possible cost to pack pieces in $\sigma$. The 
asymptotic performance ratio for an algorithm A is defined to be 

$$R_A^\infty = \limsup_{n \to \infty} \sup_{\sigma} \left\{ \frac{\mathrm{cost}_A(\sigma)}{\mathrm{cost}(\sigma)} \;\middle|\; \mathrm{cost}(\sigma) = n \right\}.$$

The optimal asymptotic performance ratio is defined to be 

$$R_{\mathrm{OPT}}^\infty = \inf_A R_A^\infty.$$

Our goal is to find an algorithm with asymptotic performance ratio close to 
$R_{\mathrm{OPT}}^\infty$. 

We now briefly review what is known about classical and variable-sized online 
bin-packing. 

The classical online bin packing problem was first investigated by Johnson. 
He showed that the Next Fit algorithm has performance ratio 2. 
Subsequently, it was shown by Johnson, Demers, Ullman, Garey and Graham 
that the First Fit algorithm has performance ratio 17/10. Yao showed that 
Revised First Fit has performance ratio 5/3, and further showed that no online 
algorithm has performance ratio less than 3/2. Brown and Liang independently 
improved this lower bound to 1.53635. Define 

$$u_1 = 2, \qquad u_{i+1} = u_i(u_i - 1) + 1,$$

and 

$$h_\infty = \sum_{i=1}^{\infty} \frac{1}{u_i - 1} \approx 1.69103.$$

Lee and Lee showed that the Harmonic algorithm, which uses bounded space, 
achieves a performance ratio arbitrarily close to $h_\infty$. They further showed 
that no bounded space online algorithm achieves a performance ratio less than 
$h_\infty$. In addition, they developed the Refined Harmonic algorithm, which 
they showed to have a performance ratio of 373/228 < 1.63597. The next 
improvement was Modified Harmonic, which Ramanan, Brown, Lee and Lee showed 
to have a performance ratio of 538/333 < 1.61562. The best claimed result to 
date is that of Richey. He presents an algorithm called Harmonic+1, and 
gives an argument that it has performance ratio 1.58872. In the author's 
opinion, Richey's argument lacks the rigor of a proof. The lower bound for online bin 
packing was improved to 1.5401 by van Vliet. 
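
The constant $h_\infty \approx 1.69103$ is easy to check numerically. The following is a minimal sketch, assuming the recurrence $u_1 = 2$, $u_{i+1} = u_i(u_i - 1) + 1$ and $h_\infty = \sum_{i \ge 1} 1/(u_i - 1)$; the function name `harmonic_limit` and the truncation after a fixed number of terms are illustrative choices, not part of the paper.

```python
from fractions import Fraction


def harmonic_limit(terms=8):
    """Partial sum of sum_i 1/(u_i - 1), with u_1 = 2 and u_{i+1} = u_i (u_i - 1) + 1."""
    u = 2
    total = Fraction(0)
    for _ in range(terms):
        total += Fraction(1, u - 1)
        u = u * (u - 1) + 1  # the sequence grows doubly exponentially
    return total


if __name__ == "__main__":
    # The first terms are 1 + 1/2 + 1/6 + 1/42 + 1/1806 + ...
    print(float(harmonic_limit()))
```

Because the sequence $u_i$ grows doubly exponentially, a handful of terms already fixes the sum to many decimal places.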

The variable-sized bin-packing problem was first investigated by Friesen and 
Langston. Kinnersley and Langston gave an online algorithm with performance 
ratio 7/4. Csirik proposed the Variable Harmonic algorithm, and showed 
that it has performance ratio at most $h_\infty$. This algorithm is based on the 
Harmonic algorithm of Lee and Lee. Like Harmonic, it uses bounded 
space. Csirik also showed that if the algorithm has two bin sizes 1 and $\alpha < 1$, 
and that if it is allowed to pick $\alpha$, then a performance ratio of 7/5 is possible. 
Subsequent authors have investigated this problem, but Csirik's result 
yields the best performance ratio to date. No lower bounds are known. 

In this work, we investigate the Variable Harmonic (VH) algorithm. We 
give a precise analysis of its performance ratio, and show that it is an optimal 
bounded space algorithm, by providing the first lower bound for variable-sized 
bin packing. An interesting feature of the analysis is that, although it is shown 
that our algorithm achieves a performance ratio arbitrarily close to the optimum 
value, it is not known precisely what that value is. The case where bins of capacity 
1 and $\alpha \in (0, 1)$ are used is studied in greater detail. It is shown that among 
algorithms which are allowed to choose $\alpha$, the optimal performance ratio (and 
that of VH) lies in [1.37530, 1.37532]. 



2 The Variable Harmonic Algorithm 



Before we describe the algorithm we require a few definitions. 

The algorithm uses as a subroutine the Next Fit algorithm. This online 
algorithm maintains a single open bin. If the current item fits into the open bin, it 
is placed there. Otherwise, the open bin is closed and a new open bin is allocated. 
Obviously, this algorithm is online, runs in linear time and uses constant space. 
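
The Next Fit rule described above can be sketched in a few lines. This is an illustrative sketch only: the function name `next_fit` and the choice to return just the number of bins opened are assumptions made for the example.

```python
def next_fit(pieces, capacity=1.0):
    """Return the number of bins Next Fit opens for the given piece sizes."""
    bins_used = 0
    space_left = 0.0  # no bin is open before the first piece arrives
    for size in pieces:
        if size > space_left:
            # The piece does not fit: close the open bin and open a new one.
            bins_used += 1
            space_left = capacity
        space_left -= size
    return bins_used


if __name__ == "__main__":
    print(next_fit([0.6, 0.3, 0.6]))
```

Only the free space of the single open bin is stored, which is why the algorithm runs in linear time and constant space.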

VH operates by classifying pieces according to a set of predefined intervals. 
The algorithm has a parameter $\epsilon \in (0, 1]$. $\mathbb{N} = \{0, 1, 2, \ldots\}$ is the set of natural 
numbers and $\mathbb{N}^+ = \mathbb{N} - \{0\}$. We define 

$$T_i = \left\{ \frac{\alpha_i}{j} \;\middle|\; j \in \mathbb{N}^+,\ \frac{\alpha_i}{j} > \epsilon \right\}, \qquad T = \bigcup_{i=1}^{m} T_i.$$

Let the members of T be $t_1 > t_2 > \cdots > t_n$. We define $t_{n+1} = \epsilon$ and $t_{n+2} = 0$. 
The interval $I_j$ is defined to be $(t_{j+1}, t_j]$ for $j = 1, \ldots, n + 1$. Note that these 
intervals are disjoint and that they cover (0, 1]. 

A piece of size s has type j if $s \in I_j$. The class and order of interval $I_k$ 
are i and j, respectively, if $t_k = \alpha_i / j$ (breaking ties arbitrarily). We extend the 
definitions of class and order to include pieces in the natural way. A piece is big 
if it has type $i \le n$. 

The algorithm packs pieces of different types independently. I.e. pieces of 
differing types never appear in the same bin. Pieces of type n + 1 are packed 
using Next Fit into bins of capacity 1. Pieces of class i and order j are packed j 
to a bin in bins of capacity $\alpha_i$. The algorithm keeps one open bin for each type, 
into which pieces are packed until the indicated number is reached, at which 
point it is closed and a new open bin is allocated. Since the number of types 
(and thus the number of open bins) is constant, the algorithm is online, runs in 
linear time and uses constant space. Note that when m = 1 the definition of VH 
corresponds exactly with that of Harmonic. 
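
The packing rule can be illustrated with a short sketch. It assumes exact rational piece sizes, that the capacity 1 is among the bin capacities, and that ties between thresholds are broken in favour of the capacity listed first; the function name `variable_harmonic_cost` and its return value (the total capacity of all bins opened) are choices made for this example, not part of the paper.

```python
from fractions import Fraction


def variable_harmonic_cost(pieces, capacities, eps):
    """Total capacity of the bins opened by VH (capacities must include 1)."""
    # Map each threshold t = alpha_i / j > eps to (bin capacity, pieces per bin).
    info = {}
    for alpha in capacities:
        j = 1
        while alpha / j > eps:
            info.setdefault(alpha / j, (alpha, j))  # ties: first capacity wins
            j += 1
    thresholds = sorted(info)

    cost = Fraction(0)
    in_open_bin = {}           # threshold -> number of pieces in its open bin
    small_space = Fraction(0)  # free space in the Next Fit bin for small pieces
    for s in pieces:
        if s <= eps:
            # Type n+1: pack with Next Fit into bins of capacity 1.
            if s > small_space:
                cost += 1
                small_space = Fraction(1)
            small_space -= s
        else:
            # Big piece: its type is given by the smallest threshold >= s.
            t = next(x for x in thresholds if x >= s)
            alpha, per_bin = info[t]
            count = in_open_bin.get(t, 0)
            if count == 0:
                cost += alpha  # open a new bin of capacity alpha
            in_open_bin[t] = (count + 1) % per_bin
    return cost


if __name__ == "__main__":
    caps = [Fraction(7, 10), Fraction(1)]
    items = [Fraction(3, 5), Fraction(3, 5), Fraction(2, 5), Fraction(2, 5)]
    print(variable_harmonic_cost(items, caps, Fraction(1, 10)))
```

With the single capacity 1 the same function behaves like Harmonic, which matches the remark that the two definitions coincide when m = 1.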

3 An Upper Bound for VH 

The analysis is based on weighting functions. A weighting function 
for algorithm A is a function $w_A : (0, 1] \to [0, 1]$ with the property 

$$\mathrm{cost}_A(\sigma) \le \sum_{i=1}^{N} w_A(p_i) + O(1),$$

for all input sequences $\sigma = p_1, \ldots, p_N$. Intuitively, the weight of a piece indicates the 
maximum portion of a bin that it can occupy. We use the following function: 

$$w_{VH}(x) = \begin{cases} t_i & \text{if } x \in I_i \text{ with } i \le n, \\ \dfrac{x}{1-\epsilon} & \text{if } x \in I_{n+1}. \end{cases}$$



Lemma 1. For all $\sigma$, 

$$\mathrm{cost}_{VH}(\sigma) \le \sum_{i=1}^{N} w_{VH}(p_i) + n + 1.$$

Proof. We consider first the cost of bins used to pack pieces of an arbitrary 
individual type. Next Fit is used to pack type n + 1 pieces. Each of these 
pieces is of size $\epsilon$ or less. Therefore, each of these bins is filled to within $\epsilon$ of its 
capacity. Let X be the total size of type n + 1 pieces. The number of bins used 
is at most 

$$\left\lceil \frac{X}{1-\epsilon} \right\rceil \le \frac{X}{1-\epsilon} + 1 = \sum_{p_i \in I_{n+1}} w_{VH}(p_i) + 1.$$

Consider now pieces of type $i \le n$. Let j and k be the class and order of $I_i$, 
respectively. These pieces are packed k to a bin in bins of capacity $\alpha_j$. Let $\ell$ be 
the number of them. Then the number of bins used is $\lceil \ell/k \rceil$. The cost to the 
algorithm is 

$$\alpha_j \left\lceil \frac{\ell}{k} \right\rceil < \frac{\alpha_j \ell}{k} + \alpha_j \le \ell\, t_i + 1 = \sum_{p \in I_i} w_{VH}(p) + 1.$$

Summing over all types gives the desired result. □ 




Now consider the optimal packing. Let $n_i$ be the number of bins of capacity 
$\alpha_i$ in the optimal packing. Let $B_{ij}$ be the set of pieces in the jth bin of size $\alpha_i$. 

Suppose bin $B_{ij}$ in the optimal packing is not full. Let $x = \sum_{p \in B_{ij}} p$. 
Then add a piece of size $\alpha_i - x$ to the end of our sequence. The cost of the 
optimal solution does not increase, whereas the cost to the online algorithm 
cannot decrease. Therefore, when upper bounding the performance ratio of an 
online algorithm, we may assume that each bin in the optimal packing is full. 

From the preceding definitions we have 

$$\mathrm{cost}(\sigma) = \sum_{i=1}^{m} \alpha_i\, n_i.$$

To show that the algorithm has performance ratio at most c, we show that 

$$\sum_{i=1}^{N} w_{VH}(p_i) \Big/ \sum_{i=1}^{m} \alpha_i\, n_i \le c,$$

for all $\sigma$. 

We find 

$$\sum_{i=1}^{N} w_{VH}(p_i) \Big/ \sum_{i=1}^{m} \alpha_i\, n_i = \sum_{i=1}^{m} \sum_{j=1}^{n_i} \sum_{p \in B_{ij}} w_{VH}(p) \Big/ \sum_{i=1}^{m} \sum_{j=1}^{n_i} \alpha_i.$$

For the above value to be greater than c, for some i and j we must have 

$$\sum_{p \in B_{ij}} w_{VH}(p) > c \cdot \alpha_i.$$

If we show that this is impossible, we show that the algorithm has performance 
ratio at most c. 

We are therefore led to consider the following optimization problem: Maximize 

$$\sum_{p \in X} w_{VH}(p) / \alpha_i,$$

subject to $\sum_{p \in X} p \le \alpha_i$, $i \in \{1, \ldots, m\}$. We further rewrite this as: Maximize 

$$\frac{1}{\alpha_i} \left( \sum_{j=1}^{n} q_j\, t_j + \frac{\alpha_i - y}{1 - \epsilon} \right) \qquad (1)$$

subject to 

$$y < \alpha_i, \qquad (2)$$
$$q_j \in \mathbb{N}, \quad 1 \le j \le n, \qquad (3)$$
$$i \in \{1, \ldots, m\}, \qquad (4)$$

where we define 

$$y = \sum_{j=1}^{n} q_j\, t_{j+1}. \qquad (5)$$

Intuitively, $q_j$ is the number of type j pieces, and y is a lower bound on the total size 
of the big pieces. Note that strict inequality is required in (2) because a type j 
piece is strictly larger than $t_{j+1}$. We shall consider this mathematical program 
to be a function of $\epsilon$, which we denote as $\mathcal{P}_1(\epsilon)$. 

We have shown the following: 



Theorem 1. The performance ratio of VH is upper bounded by the value of $\mathcal{P}_1(\epsilon)$. 

We shall return to the evaluation of $\mathcal{P}_1(\epsilon)$ at a point later in our exposition. 



4 A Lower Bound for Bounded-Space Algorithms 

Consider the mathematical program: Maximize 

$$\frac{1}{\alpha_i} \left( \sum_{j=1}^{n} q_j\, t_j + \alpha_i - y \right) \qquad (6)$$

subject to (2)-(4). We denote this program as $\mathcal{P}_2(\epsilon)$. We have the following result: 



Lemma 2. Any feasible solution to $\mathcal{P}_2(\epsilon)$ with objective value c yields a lower 
bound of c for all bounded space variable-sized bin packing algorithms. 

Proof. Let $q_1, \ldots, q_n, i$ be a feasible solution and let c be the corresponding value 
of (6). Define $Q = \sum_{j=1}^{n} q_j + 1$. We create an input with $N \cdot Q$ pieces. Let $\delta > 0$ 
be a real number. The input consists of n + 1 groups of pieces. All pieces in a 
group are the same size. The jth group contains $q_j N$ pieces of size $t_{j+1} + \delta$ for 
$1 \le j \le n$. The last group contains N pieces of size $z = \alpha_i - y - (Q - 1)\delta$. We 
require that $\delta$ be chosen such that $z > 0$ and $t_{j+1} + \delta \in I_j$ for $1 \le j \le n$. 

The optimal solution uses N bins of size $\alpha_i$. Each of these bins contains $q_j$ 
items from group j for $1 \le j \le n$ and one item from the last group. 

Consider the number of bins used by an arbitrary bounded space online 
algorithm A on this input. Since A is online and uses bounded space, at most 
a constant number (with respect to N) of pieces in group j can be packed with 
pieces from other groups. Therefore, the cost to pack items in the last group is 
at least $Nz - O(1)$. Further, the minimum cost to pack the items in group $j \le n$ 
is 

$$\min_{1 \le k \le m} \frac{\alpha_k\, q_j N}{\lfloor \alpha_k / (t_{j+1} + \delta) \rfloor} - O(1) = q_j N \min_{1 \le k \le m} \frac{\alpha_k}{\lfloor \alpha_k / (t_{j+1} + \delta) \rfloor} - O(1).$$

We assert that 

$$\min_{1 \le k \le m} \frac{\alpha_k}{\lfloor \alpha_k / (t_{j+1} + \delta) \rfloor} \ge t_j. \qquad (7)$$

Given this fact, the asymptotic performance ratio of A is lower bounded by 

$$\lim_{N \to \infty} \frac{N \left( \sum_{j=1}^{n} t_j\, q_j + \alpha_i - y - (Q - 1)\delta \right) - O(1)}{\alpha_i N} = c - \frac{(Q - 1)\delta}{\alpha_i},$$

which can be made arbitrarily close to c by choosing sufficiently small $\delta$. 
Therefore, to complete the proof we need merely show (7). 

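Suppose for a contradiction that (7) is false. Then there exists a k such that 

$$\frac{\alpha_k}{\lfloor \alpha_k / (t_{j+1} + \delta) \rfloor} < t_j.$$

Let $\ell$ be the integer $\lfloor \alpha_k / (t_{j+1} + \delta) \rfloor$. Note that since 

$$\frac{(t_{j+1} + \delta)\,\ell}{\alpha_k} = \frac{t_{j+1} + \delta}{\alpha_k} \left\lfloor \frac{\alpha_k}{t_{j+1} + \delta} \right\rfloor \le 1,$$

we have $\alpha_k / \ell \ge t_{j+1} + \delta$. Therefore we have $\alpha_k / \ell \in [t_{j+1} + \delta, t_j) \subset I_j$. However, 
by the definition of $I_j$, the only number of the form $\alpha_k / \ell$, $\ell$ an integer, within 
$I_j$ is $t_j$ itself. So we have reached a contradiction. □ 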



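Lemma 3. 

$$\lim_{\epsilon \to 0} \left[ \mathrm{value}(\mathcal{P}_1(\epsilon)) - \mathrm{value}(\mathcal{P}_2(\epsilon)) \right] = 0.$$

Proof. Since $\mathcal{P}_1(\epsilon)$ and $\mathcal{P}_2(\epsilon)$ share the same constraints, any feasible solution 
to one is a feasible solution to the other. Further, for any feasible solution, the 
value of (1) is greater than that of (6) by 

$$\frac{\alpha_i - y}{\alpha_i} \left( \frac{1}{1 - \epsilon} - 1 \right) = \frac{(\alpha_i - y)\,\epsilon}{\alpha_i (1 - \epsilon)} \le \frac{\epsilon}{1 - \epsilon}.$$

Consider the assignment to $q_1, \ldots, q_n, i$ which maximizes (1). The value of (6) 
at this feasible solution lower bounds the value of $\mathcal{P}_2(\epsilon)$. So we have 

$$\lim_{\epsilon \to 0} \left[ \mathrm{value}(\mathcal{P}_1(\epsilon)) - \mathrm{value}(\mathcal{P}_2(\epsilon)) \right] \le \lim_{\epsilon \to 0} \frac{\epsilon}{1 - \epsilon} = 0.$$

□ 

We are now ready to state the main result: 

Theorem 2. VH is an optimal bounded space online algorithm. 

Proof. This follows directly from Theorem 1 and Lemmas 2 and 3. □ 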



5 Evaluating $\mathcal{P}_1$ and $\mathcal{P}_2$ 

We present an algorithm to evaluate $\mathcal{P}_1(\epsilon)$ and $\mathcal{P}_2(\epsilon)$. Define y as in (5). Further 
define: 

$$e_j = \frac{t_j}{t_{j+1}} \ \text{ for } 1 \le j \le n, \qquad e_{n+1} = \frac{1}{1 - \epsilon}, \qquad E_j = \max_{j \le k \le n+1} e_k \ \text{ for } 1 \le j \le n + 1.$$

The algorithm to compute $\mathcal{P}_1(\epsilon)$ is displayed in Figures 1 and 2. By simply 
redefining the value of $e_{n+1}$ to be 1, we compute instead $\mathcal{P}_2(\epsilon)$. 

    x ← 1. 
    For i ∈ {1, ..., m} do: 
        Initialize q_j ← 0 for 1 ≤ j ≤ n. 
        Tryall(1). 
    Return x. 

Fig. 1. The algorithm for computing $\mathcal{P}_1(\epsilon)$. 


The main routine of the algorithm initializes variables, and iterates over the 
possible values of i. The variable x stores the maximum objective value found at any 
point in the computation. 

The real work of the algorithm is done in the subroutine Tryall. This 
subroutine traverses an implicit data structure which we call the feasible solution 
tree. This is a rooted tree with n + 1 levels. The edges of the tree are labeled 
with natural numbers. All leaves are at level n + 1. Along any path from the root 
to a leaf, the label of the jth edge represents the value of $q_j$. Each such path 
specifies a feasible solution. Tryall recursively traverses the feasible solution 
tree, avoiding sub-trees which cannot improve on the best solution found so far. 

    z ← (1/α_i) (Σ_{k=1}^{j-1} q_k t_k + (α_i - y) E_j). 
    If z ≤ x then: 
        Return. 
    Else if j = n + 1 then: 
        x ← z. 
    Else: 
        q_j ← ⌈(α_i - y) / t_{j+1}⌉. 
        While q_j > 0 do: 
            q_j ← q_j - 1. 
            Tryall(j + 1). 

Fig. 2. The subroutine Tryall(j). 



During the execution of Tryall, j represents our current level in the tree. 
If j = n + 1, then we have reached a leaf. In this case, the value z assigned in 
the first step of the algorithm is exactly the objective value. When j < n, the 
value of z is an upper bound on any objective value in the current sub-tree. If 
this value does not exceed x then we do not explore this sub-tree. Otherwise, 
we recurse for each of the possible values of qj, from the largest down to zero. 
This heuristic drastically decreases the running time of the algorithm, as the 
first solution found is the greedy one, where by greedy we mean the largest 
possible item is added to the solution at each step. In fact there are several 
reasonable alternative definitions of greedy for the variable-size problem. The 
optimal solution is quite often greedy, however, there are some cases where the 
optimal solution is not any greedy one. We shall explain this in more detail in 
the full version. 
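
The branch-and-bound search of this section can be sketched as follows. This is a sketch under stated assumptions, not the author's implementation: it assumes exact rational arithmetic, passes the accumulated weight and the value y down the recursion instead of storing the $q_j$ globally, uses 0-based indices, and selects between the two programs through the hypothetical parameter `tail_ratio` (the value of $e_{n+1}$: $1/(1-\epsilon)$ for $\mathcal{P}_1$, and 1 for $\mathcal{P}_2$).

```python
from fractions import Fraction
from math import ceil


def thresholds(capacities, eps):
    """Members of T in decreasing order: every alpha_i / j that exceeds eps."""
    points = set()
    for alpha in capacities:
        j = 1
        while alpha / j > eps:
            points.add(alpha / j)
            j += 1
    return sorted(points, reverse=True)


def evaluate(capacities, eps, tail_ratio):
    """Maximum objective value found by the Tryall-style search."""
    t = thresholds(capacities, eps) + [eps]  # t[0..n-1] = t_1..t_n, t[n] = eps
    n = len(t) - 1
    e = [t[j] / t[j + 1] for j in range(n)] + [tail_ratio]
    big_e = e[:]  # big_e[j] = max(e[j], ..., e[n]), i.e. the suffix maxima E_j
    for j in range(n - 1, -1, -1):
        big_e[j] = max(e[j], big_e[j + 1])
    best = Fraction(1)

    def tryall(j, weight, y, alpha):
        nonlocal best
        # Upper bound on every objective value reachable in this sub-tree.
        z = (weight + (alpha - y) * big_e[j]) / alpha
        if z <= best:
            return  # this sub-tree cannot improve on the best value so far
        if j == n:
            best = z  # at a leaf the bound is the exact objective value
            return
        q = ceil((alpha - y) / t[j + 1])
        while q > 0:
            q -= 1  # try the largest feasible q_j first, then smaller ones
            tryall(j + 1, weight + q * t[j], y + q * t[j + 1], alpha)

    for alpha in capacities:
        tryall(0, Fraction(0), Fraction(0), alpha)
    return best


if __name__ == "__main__":
    eps = Fraction(1, 50)
    print(float(evaluate([Fraction(1)], eps, 1 / (1 - eps))))  # P1, one bin size
    print(float(evaluate([Fraction(1)], eps, Fraction(1))))    # P2, one bin size
```

With a single bin size the two values bracket the constant $h_\infty \approx 1.69103$ of the classical problem, and they differ by at most $\epsilon/(1-\epsilon)$, in line with Lemma 3.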

6 The Dual-Capacity Problem 

We now focus our attention on the case where m = 2, which we call the 
dual-capacity problem. In order to simplify notation, we denote the size of the smaller 
bin as $\alpha$. The size of the larger bin is 1, as before. We investigate how the 
optimal asymptotic performance ratio $R_{\mathrm{OPT}}^\infty(\alpha)$ varies as a function of $\alpha$. We therefore 
consider the values of $\mathcal{P}_1$ and $\mathcal{P}_2$ to be functions of $\alpha$, as well as $\epsilon$. As we have 
shown in the preceding sections, $\mathcal{P}_2(\epsilon, \alpha) \le R_{\mathrm{OPT}}^\infty(\alpha) \le \mathcal{P}_1(\epsilon, \alpha)$. 

Using the algorithm described in Section 5 we can compute a lower bound 
for any fixed value of $\alpha$. However, what we would like to do is prove a lower 
bound which holds for all $\alpha$. To further this goal, we study the structure of 
$\mathcal{P}_2(\epsilon, \alpha)$ in more detail. 

Since we only need to lower bound $\mathcal{P}_2(\epsilon, \alpha)$, for the discussion that follows 
we fix i = 2. We consider only $\epsilon = 1/M$ for $M \in \mathbb{N}^+$. 

Note that, as $\alpha$ varies, the values in $T_1$ change, while the values in $T_2$ are 
fixed. For certain values of $\alpha$, it may happen that a point in $T_1$ is coincident 
with one in $T_2$. We call such a point an interesting point. At an interesting 
point, a combinatorial change occurs in the structure of $\mathcal{P}_2(\epsilon, \alpha)$ since two 
points exchange order. We also include zero as an interesting point. It is not 
hard to see that the set of interesting points is: 

$$\{0\} \cup \bigcup_{i=1}^{M} \bigcup_{j=i}^{M} \left\{ \frac{i}{j} \right\}.$$

Rename the points in this set to be $0 = s_1 < s_2 < \cdots < s_L = 1$. Define 
$S_j = (s_{j-1}, s_j]$. We shall explain how to lower bound the value of $\mathcal{P}_2(\epsilon, \alpha)$ for 
$\alpha \in S_j$. 

We further split each interval $S_j$ into disjoint sub-intervals $(u_1, u_2], (u_2, u_3], 
\ldots, (u_{\ell-1}, u_\ell]$ with $u_1 = s_{j-1}$ and $u_\ell = s_j$. Suppose that $q_1, \ldots, q_n, i = 2$ is a 
feasible solution to $\mathcal{P}_2(\epsilon, \alpha)$ at $\alpha = u_k$. We assert that this is a feasible solution for 
all $\alpha \in (u_{k-1}, u_k]$. To see this, note that within $S_j$, y is a non-decreasing 
linear function of $\alpha$, for any fixed assignment to $q_1, \ldots, q_n, i$. Therefore, (2) 
remains valid throughout $(s_{j-1}, u_k] \supseteq (u_{k-1}, u_k]$. Furthermore, the objective (6) 
is also a linear function of $\alpha$ for any fixed assignment to $q_1, \ldots, q_n, i$. 

To get a lower bound for $(u_{k-1}, u_k]$, we evaluate $\mathcal{P}_2(\epsilon, u_k)$ (fixing i = 2). We 
record the assignment $q_1, \ldots, q_n$ which gives the highest lower bound. Substituting 
these values into (6), we get a linear function. This is minimized at either 
$\alpha = u_{k-1}$ or $\alpha = u_k$. We evaluate the function at these two points to get the 
desired lower bound on $(u_{k-1}, u_k]$. Iterating over all intervals and sub-intervals, 
we can compute a lower bound for all $\alpha$. 
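
The set of interesting points is simply the set of fractions i/j with $1 \le i \le j \le M$, together with zero. A minimal sketch that enumerates them, assuming $\epsilon = 1/M$ as in the text (the function name `interesting_points` is illustrative):

```python
from fractions import Fraction


def interesting_points(M):
    """Sorted interesting points for eps = 1/M: zero and all i/j, 1 <= i <= j <= M."""
    points = {Fraction(0)}
    for i in range(1, M + 1):
        for j in range(i, M + 1):
            points.add(Fraction(i, j))  # Fraction reduces i/j, so duplicates merge
    return sorted(points)


if __name__ == "__main__":
    pts = interesting_points(10)
    # Consecutive points delimit the intervals S_j = (s_{j-1}, s_j].
    intervals = list(zip(pts, pts[1:]))
    print(len(pts), len(intervals))
```

For M = 10 this gives 33 points and hence 32 intervals.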

As an example, let $\epsilon = 1/10$. This yields 33 interesting points and 32 intervals. 
For simplicity's sake, we do not divide an interval into subintervals. The resulting 
lower bound functions are tabulated by interval of $\alpha$. 
We get a lower bound of $273/200 = 1.365$ at $\alpha = 7/10$. We have combined adjacent 
intervals where possible: when the lower bound function is the same in two 
adjacent intervals, we have one entry for their union. 

The main result of this section is: 

Theorem 3. 

$$1.37532 > \frac{395101163}{287280000} \ge \inf_{\alpha \in (0,1]} R_{\mathrm{OPT}}^\infty(\alpha) \ge \frac{78392621}{57000000} > 1.37530.$$



Proof. The value of $\mathcal{P}_1(1/100, 57143/80000)$ is 395101163/287280000. From 
Figure 3 we find that if $\mathcal{P}_2(\epsilon, \alpha) < 78392621/57000000$ then we must have 
$\alpha \in (7/10, 3/4]$. We determine the intervals $S_j$ for $\epsilon = 1/100$, and eliminate 
those which do not overlap (7/10, 3/4]. This leaves 154 intervals to be checked. 
These intervals were divided further into sub-intervals, by including all points 
$7/10 + j/100000$ for $1 \le j \le 2500$. We have verified using Mathematica that the 
minimum lower bound over this set of sub-intervals is 78392621/57000000. □ 

To get a better picture of the curve $R_{\mathrm{OPT}}^\infty(\alpha)$, we have computed values of 
$\mathcal{P}_1$ and $\mathcal{P}_2$ for $\alpha = j/10000$, $1 \le j \le 10000$, using Mathematica. 
The maximum difference between the two values at any of these points was less 
than 0.00014. Since the difference is so small, we display just $\mathcal{P}_1$ in Figure 4. 






[Table: lower bound function for each interval of $\alpha$; columns Low $\alpha$, High $\alpha$, Function.] 

Fig. 4. Values of $\mathcal{P}_1$ as a function of $\alpha$. 



7 Conclusions 

We have shown the optimality of Variable Harmonic among bounded space 
algorithms for variable sized bin packing. A number of open questions remain: 

1. The curve investigated in Section 6 seems to have the property that 
the closer we examine it, the more detail it reveals, i.e., it is "fractal like" in 
some sense. What can be said about this curve? 

2. What general upper and lower bounds can be proved for variable-sized bin 
packing? 

Institut für Mathematik, TU Graz, Austria. 



Abstract. We study online bounded space bin packing in the resource 
augmentation model of competitive analysis. In this model, the online 
bounded space packing algorithm has to pack a list L of items in (0, 1] 
into a small number of bins of size $b \ge 1$. Its performance is measured 
by comparing the produced packing against the optimal offline packing 
of the list L into bins of size 1. 

We present a complete solution to this problem: For every bin size $b \ge 1$, 
we design online bounded space bin packing algorithms whose worst 
case ratio in this model comes arbitrarily close to a certain bound $\rho(b)$. 
Moreover, we prove that no online bounded space algorithm can perform 
better than $\rho(b)$ in the worst case. 

Keywords. Online algorithm, competitive analysis, resource augmenta- 
tion, approximation algorithm, asymptotic worst case ratio, bin packing. 



1 Introduction 

Resource augmentation (or extra-resource analysis) is a technique for analyzing 
online algorithms that was introduced in 1995 by Kalyanasundaram & Pruhs. 
It is a relaxed notion of competitive analysis in which the online algorithm 
is given better resources than the optimal offline algorithm to which it is com- 
pared. This is e.g. the case, if the machines of the online algorithm run at slightly 
higher speed than those of the offline algorithm, or if the online algorithm has 
more machines than the offline algorithm, or if the production deadlines of the 
online algorithm are less stringent than those of the offline algorithm. The main 
idea behind the resource augmentation technique is to give the online algorithm 
a fairer chance in competing against the omniscient and all-powerful offline algo- 
rithm from classical competitive analysis. During the last few years the resource 
augmentation technique has become a very popular tool, and it has been applied 
to many problems in scheduling (cf. e.g. Phillips, Stein, Torng & Wein and 
Edmonds), in paging (Albers, Arora & Khanna), and in combinatorial 
optimization (Kalyanasundaram & Pruhs). In this paper we will study online 
bounded space bin packing in this resource augmentation model. 

In the classical bin packing problem, a list $L = (a_1, a_2, \ldots)$ of items $a_i \in 
(0, 1]$ has to be packed into the minimum number of unit-size bins. The offline 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 2flfi- RiT31 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
optimum $\mathrm{Opt}_1(L)$ is the minimum number of unit-size bins into which the items 
in L can be fit. A bin packing algorithm is called online if it packs every item 
$a_i$ solely on the basis of the sizes of the items $a_j$, $1 \le j \le i$, and without any 
information on subsequent items. A bin packing algorithm uses k-bounded space 
if for each item $a_i$, the choice of bins to pack it into is restricted to a set of k or 
fewer active bins. Each bin becomes active when it receives its first item, but once 
it is declared inactive (or closed), it can never become active again. An online 
bounded space bin packing algorithm is an online algorithm that uses k-bounded 
space for some fixed value $k \ge 1$. The bounded space restriction models situations 
in which bins are exported once they are packed (e.g., in packing trucks at a 
loading dock that has positions for only k trucks, or in communication channels 
with buffers of limited size in which information moves in large fixed-size blocks). 

We investigate the behavior of online bounded space bin packing algorithms 
that pack the list L into bins of size $b \ge 1$. This larger bin size b is the augmented 
resource of the online algorithm; the offline algorithm has to work with bins of 
size 1. For an online algorithm A and a bin size b, we denote by $A_b(L)$ the number 
of bins of size b that algorithm A uses in packing the items in L. The worst case 
performance of algorithm A for bin size b, denoted by $R_b(A)$, is defined as 

$$R_b(A) = \lim_{\mathrm{Opt}_1(L) \to \infty} \sup_{L} A_b(L) / \mathrm{Opt}_1(L).$$

A small worst case performance means a good quality of the online algorithm. 
Online bin packing is a classical problem in optimization and theoretical 
computer science. We refer the reader to Csirik & Woeginger [5] for an up-to-date 
survey of this area. 

Our results and organization of the paper. In this paper we present a 
complete analysis of online bounded space bin packing in the resource 
augmentation model: For every bin size $b \ge 1$, we determine the best possible worst 
case performance $\rho(b)$ over all online bounded space bin packing algorithms. 
The precise values $\rho(b)$ are defined in Section 2. In Section 3 we state several 
auxiliary results. In Section 4 we discuss technical properties of the function 
$\rho(b)$. In Section 5 we design and analyze an online algorithm whose worst case 
performance comes arbitrarily close to $\rho(b)$. Finally, in Section 6 we prove that 
no online algorithm can beat the bound $\rho(b)$. 

2 Statement of the Main Result 

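Throughout the paper, $L = (a_1, a_2, \ldots, a_n)$ is a list of items in (0, 1], and $b \ge 1$ 
is the bin size for the online algorithm. We associate with b an infinite sequence 
$T(b) = (t_1, t_2, \ldots)$ of positive integers as follows: 

$$t_1 = \lfloor 1 + b \rfloor \quad \text{and} \quad r_1 = \frac{1}{b} - \frac{1}{t_1}, \qquad (1)$$

and for $i = 1, 2, \ldots$ 

$$t_{i+1} = \left\lfloor 1 + \frac{1}{r_i} \right\rfloor \quad \text{and} \quad r_{i+1} = r_i - \frac{1}{t_{i+1}}. \qquad (2)$$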

Fig. 1. The graph of the function $\rho(b)$. 



An equivalent way for defining this sequence T(b) is the following: Suppose 
that we want to fill a bucket of size 1/b greedily with reciprocal values of positive 
integers. First, we pack the largest possible reciprocal value that fits into the 
bucket, but without filling it completely. Then we add the largest reciprocal 
value that fits without filling the rest capacity completely, and then this process 
is repeated over and over again. In this 'bucket' interpretation, the value $r_i$ 
represents the rest capacity after the reciprocal value of the positive integer $t_i$ 
has been put into the bucket. Note that the smallest integer whose reciprocal 
would fit into a space of $r \le 1$ is $\lceil 1/r \rceil$. If 1/r happens to be an integer, we 
must not fill the bucket completely, and hence we have to pack the reciprocal of 
$\lceil 1/r \rceil + 1$ instead. The reader may want to verify that the recursive definitions in 
(1) and (2) exactly agree with these interpretations. Altogether, this discussion 
demonstrates that 

$$\frac{1}{b} = \sum_{i=1}^{\infty} \frac{1}{t_i}. \qquad (3)$$

Finally, we define 

$$\rho(b) = \sum_{i=1}^{\infty} \frac{1}{t_i - 1}. \qquad (4)$$

In Section 3 we will prove that the infinite sum in the righthand side of (4) 
converges for every value of b. The following lemma provides the reader with 
some intuition on the (somewhat irregular and somewhat messy) behaviour of 
the function $\rho(b)$; see also the picture in Figure 1 for an illustration. The lemma 
will be proved in Section 4. 

Lemma 1. The function $\rho(b) : [1, \infty) \to \mathbb{R}$ has the following properties. 

(i) $\rho(1) \approx 1.69103$ and $\rho(2) \approx 0.69103$. 

(ii) $1/m < \rho(m) < 1/(m - 1)$ for integers $m \ge 2$. 

(iii) $\rho(b)$ is strictly decreasing on $[1, \infty)$. 

(iv) As b tends to 2 from below, $\rho(b)$ tends to 1. As b tends to infinity, $\rho(b)$ tends 
to 0. 

(v) At every irrational value of $b > 1$, the function $\rho(b)$ is continuous. 

(vi) At every rational value of $b \ge 1$, the function $\rho(b)$ is not continuous. 
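
Properties (i) and (ii) are easy to check numerically. The sketch below assumes the recursion $t_1 = \lfloor 1 + b \rfloor$, $r_1 = 1/b - 1/t_1$, $t_{i+1} = \lfloor 1 + 1/r_i \rfloor$, $r_{i+1} = r_i - 1/t_{i+1}$ and the definition $\rho(b) = \sum_i 1/(t_i - 1)$; it is restricted to rational b so that exact arithmetic can be used, and the function names `sequence_t` and `rho` as well as the truncation after a fixed number of terms are illustrative.

```python
from fractions import Fraction
from math import floor


def sequence_t(b, terms=8):
    """First terms of T(b) for a rational bin size b >= 1."""
    rest = 1 / Fraction(b)  # remaining capacity of the bucket of size 1/b
    ts = []
    for _ in range(terms):
        t = floor(1 + 1 / rest)  # largest reciprocal that fits without filling
        rest -= Fraction(1, t)
        ts.append(t)
    return ts


def rho(b, terms=8):
    """Truncated value of rho(b) = sum_i 1 / (t_i - 1)."""
    return sum(Fraction(1, t - 1) for t in sequence_t(b, terms))


if __name__ == "__main__":
    print(sequence_t(1, 5))  # starts 2, 3, 7, 43, 1807
    print(float(rho(1)), float(rho(2)))
```

The terms grow so quickly that eight of them are far more than is needed for five correct decimal places.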

The following theorem summarizes the main result of this paper. Its proof is 
split into the proof of the upper bound in Section 5 and the 
proof of the lower bound in Section 6. 

Theorem 2. (Main result of the paper) 

For every bin size $b \ge 1$, there exist online bounded space bin packing algorithms 
with worst case performance arbitrarily close to $\rho(b)$. For every bin size $b \ge 1$, the 
bound $\rho(b)$ cannot be beaten by an online bounded space bin packing algorithm. 

Note that by setting b = 1 in Theorem 2 we get a worst case performance of 
$\rho(1) \approx 1.69103$. Hence, this special case reproves the well-known result of Lee & 
Lee on classical online bounded space bin packing. 



3 Some Useful Facts 



In this section we collect several facts on the sequence T(b) that will be used 
in the later sections. First, we observe that for every $b \ge 1$ the corresponding 
sequence $T(b) = (t_1, t_2, \ldots)$ is growing rapidly: By the equations in (1) and (2), we 
have $r_{i-1} \le 1/(t_i - 1)$ and $1/t_{i+1} < r_{i-1} - 1/t_i$, where $r_0 = 1/b$. Consequently, 
$1/t_{i+1} < 1/(t_i - 1) - 1/t_i$. Rewriting this yields the inequality $t_{i+1} > t_i(t_i - 1)$, which in 
turn is equivalent to 

$$t_{i+1} - 1 \ge t_i(t_i - 1) \quad \text{for all } i \ge 1. \qquad (5)$$




Next, consider some fixed index j ≥ 1. A straightforward inductive argument 
based on (5) yields that t_{j+k} − 1 ≥ (t_j − 1)^{k+1} holds for all k ≥ 0. From this we 
get that 

\[ \sum_{k=0}^{\infty} \frac{1}{t_{j+k} - 1} \le \sum_{k=0}^{\infty} (t_j - 1)^{-k-1} = \frac{1}{t_j - 2}. \tag{6} \]

For j = 1 this inequality demonstrates that the infinite series in equation (4) 
indeed converges, and that the function ρ(b) is well-defined. 
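
The inductive argument and the geometric series behind (6) can be spelled out as follows. This is a sketch added for the reader, not the authors' wording; the closed form in the last line assumes t_j ≥ 3, so that the ratio 1/(t_j − 1) is smaller than 1.

```latex
% Base case k = 0: t_j - 1 \ge (t_j - 1)^1 holds with equality.
% Induction step, using (5) and t_j - 1 \ge 1:
t_{j+k+1} - 1 \;\ge\; t_{j+k}\,(t_{j+k} - 1) \;\ge\; (t_{j+k} - 1)^2
  \;\ge\; (t_j - 1)^{2(k+1)} \;\ge\; (t_j - 1)^{k+2}.
% Geometric series with ratio 1/(t_j - 1) < 1:
\sum_{k=0}^{\infty} (t_j - 1)^{-k-1}
  \;=\; \frac{1}{t_j - 1} \cdot \frac{1}{1 - \frac{1}{t_j - 1}}
  \;=\; \frac{1}{t_j - 2}.
```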

The following result will be used in the proof of Lemma 6. 

Lemma 3. Let z ≥ 1 be an integer. Then the sequence T(b) fulfills the inequality 

\[ \frac{t_z + 1}{t_z} \sum_{i=z}^{\infty} \frac{1}{t_i} \le \sum_{i=z}^{\infty} \frac{1}{t_i - 1}. \tag{7} \]

Proof. Omitted in this version. □ 



4 Some Properties of the Function ρ(b) 



This section is devoted to the proof of Lemma 1. Since by (6) the underlying 
series converges fast, the values ρ(1) and ρ(2) in statement (i) of Lemma 1 are 
easy to approximate by a computer program. For statement (ii), consider an 
integer m ≥ 2. Since the sequence T(m) starts with t_1 = m + 1, the definition 
of ρ(b) in (4) immediately yields ρ(m) > 1/m. Moreover, by setting j = 1 in 
inequality (6) we get that 

\[ \rho(m) = \sum_{i=1}^{\infty} \frac{1}{t_i - 1} \le \frac{1}{t_1 - 2} = \frac{1}{m - 1} \quad \text{for all integers } m \ge 2. \tag{8} \]



This completes the proof of statement (ii). We turn to statement (iii). Let 1 ≤ 
a < b, and let T(a) = (t_i) and T(b) = (t'_i) denote the two infinite sequences 
associated with a and b. Define j ≥ 1 to be the smallest index with t_j ≠ t'_j. 
Since a < b, this implies t_j ≤ t'_j − 1. Then 

\[ \rho(a) - \rho(b) = \sum_{i=j}^{\infty} \frac{1}{t_i - 1} - \sum_{i=j}^{\infty} \frac{1}{t'_i - 1} > \frac{1}{t_j - 1} - \frac{1}{t'_j - 2} \ge 0 \tag{9} \]

where we used (6) to derive the first inequality and t_j ≤ t'_j − 1 in the second 
inequality. Hence a < b indeed implies ρ(a) > ρ(b). 

Next, we turn to statement (iv). Let m ≥ 2 be an integer and consider the 
value b_m = 2m/(m + 2). It can be verified that the sequence T(b_m) starts with 
the term t_1 = 2, which is followed by all the terms of the sequence T(m). 
Consequently, ρ(b_m) = 1 + ρ(m) holds and from (8) we get that 1 + 1/m < 
ρ(b_m) ≤ 1 + 1/(m − 1). As m goes to ∞, b_m tends to 2 from below, and ρ(b_m) tends 
to 1 from above. Since ρ(b) is a decreasing function by statement (iii), we have 
thus proved the first part of statement (iv). The second part of statement (iv) 
follows by combining statements (ii) and (iii). 
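
The claim that T(b_m) consists of the term 2 followed by the terms of T(m) can be checked mechanically for small m. The sketch below assumes the same form of the recursion as in the definition of T(b) (t_1 = ⌊b⌋ + 1, t_{i+1} = ⌊1/r_i⌋ + 1); `sequence_T` and `prefix_matches` are illustrative names.

```python
from fractions import Fraction
from math import floor

def sequence_T(b, terms):
    """First `terms` entries of T(b) under the assumed recursion."""
    b = Fraction(b)
    t = floor(b) + 1
    r = 1 / b - Fraction(1, t)
    seq = [t]
    while len(seq) < terms:
        t = floor(1 / r) + 1
        r -= Fraction(1, t)
        seq.append(t)
    return seq

def prefix_matches(m, terms=5):
    """Check that T(b_m) = (2, T(m)) on the first terms+1 entries."""
    b_m = Fraction(2 * m, m + 2)
    return sequence_T(b_m, terms + 1) == [2] + sequence_T(m, terms)

print(all(prefix_matches(m) for m in range(2, 20)))
```

For m = 2 the value b_2 equals 1, and the check reduces to T(1) = (2, 3, 7, 43, ...) versus T(2) = (3, 7, 43, ...).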

The (very technical) proofs of statements (v) and (vi) will be given in the 
full version of this paper. 



5 Proof of the Upper Bound 

In this section, we prove the upper bound stated in Theorem 7. As usual, let b ≥ 1 
denote the bin size, and let T(b) = (t_1, t_2, ...) be the integer sequence associated 
with b. Let ℓ ≥ 3 be an integer. We introduce intervals I_j with j = 1, ..., t_ℓ 
that form a partition of the interval (0, b]. For 1 ≤ j ≤ t_ℓ − 1, we define the 
interval I_j = (b/(j+1), b/j]. Moreover, we define the last interval I_{t_ℓ} = (0, b/t_ℓ]. 

Our online algorithm keeps one active bin B_j for every interval I_j (j = 
1, ..., t_ℓ). All items from the interval I_j ∩ (0, 1] are packed into the corresponding 
active bin B_j. If a newly arrived item does not fit into B_j, this bin is closed, and 
a new corresponding bin for interval I_j is opened. In other words, the items from 
interval I_j ∩ (0, 1] are packed into the active bins B_j according to the NEXT-FIT 
algorithm. This completes the description of the online algorithm. 
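
The algorithm just described can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the intervals are taken to be I_j = (b/(j+1), b/j] with last interval (0, b/t_ℓ], and the names `interval_index`, `pack` and the parameter `t_ell` (standing for t_ℓ) are hypothetical.

```python
from fractions import Fraction
from math import floor

def interval_index(x, b, t_ell):
    """Index j of the interval I_j that contains the item size x."""
    # x in (b/(j+1), b/j] means floor(b/x) = j; small items share I_{t_ell}.
    return min(floor(Fraction(b) / Fraction(x)), t_ell)

def pack(items, b, t_ell):
    """Run NEXT-FIT separately per interval; return the number of bins opened."""
    b = Fraction(b)
    load = {}      # load[j] is the content of the active bin B_j
    bins = 0
    for x in map(Fraction, items):
        j = interval_index(x, b, t_ell)
        if j not in load or load[j] + x > b:
            load[j] = Fraction(0)   # close B_j (if any) and open a new one
            bins += 1
        load[j] += x
    return bins

b = Fraction(3, 2)
print(pack([1, 1, 1], b, 43))                 # items from I_1: one per bin
print(pack([Fraction(1, 2)] * 6, b, 43))      # items from I_3: three per bin
```

At any time at most t_ℓ bins are active, which is what makes the algorithm bounded space.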

To analyze this online algorithm, we define the following weight function 
w : (0, 1] → ℝ. For items x in I_j with 1 ≤ j ≤ t_ℓ − 1, we define w(x) = 1/j. For 
items x in the last interval I_{t_ℓ} we define w(x) = (x t_ℓ)/(b t_ℓ − b). The weight of a 
packed bin equals the sum of the weights of the items contained in this bin. The 
weight w(L) of an item list L equals the sum of the weights of the items in L. 

Lemma 4. Every bin of size b that has been closed by the online algorithm 
contains items of total weight at least 1. 

Proof. First assume that the closed bin belongs to an interval I_j with 1 ≤ j ≤ 
t_ℓ − 1. Then it contains exactly j items, and each of these items has weight 1/j. 
Next assume that the closed bin belongs to the interval I_{t_ℓ}. Then the bin has 
been closed, since a new item from I_{t_ℓ} did not fit into it. Hence, the total size 
of its items is at least b − b/t_ℓ. Since on the interval I_{t_ℓ} the weight function is 
linear with slope t_ℓ/(b t_ℓ − b), the weight of such a bin is at least 1. □ 
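
The two cases in the proof of Lemma 4 can be checked on concrete numbers. The sketch below assumes the weight function as described above (w(x) = 1/j on I_j = (b/(j+1), b/j] for j ≤ t_ℓ − 1, and w(x) = x t_ℓ/(b t_ℓ − b) on the last interval); `weight` and `t_ell` are illustrative names.

```python
from fractions import Fraction
from math import floor

def weight(x, b, t_ell):
    """Weight of an item of size x for bin size b and cut-off t_ell."""
    x, b = Fraction(x), Fraction(b)
    j = floor(b / x)                       # x lies in I_j for j <= t_ell - 1
    if j <= t_ell - 1:
        return Fraction(1, j)
    return x * t_ell / (b * t_ell - b)     # linear part on the last interval

b, t_ell = Fraction(3, 2), 43

# A closed bin of I_3 holds exactly three items, each of weight 1/3.
print(3 * weight(Fraction(1, 2), b, t_ell))

# 147 items of size 1/100 have total size 1.47 >= b - b/t_ell, hence weight >= 1.
small = Fraction(1, 100)
print(147 * small >= b - b / t_ell, 147 * weight(small, b, t_ell) >= 1)
```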

Lemma 5. Let 1 ≤ z ≤ ℓ − 1 be an integer. Then for every positive real number 
x ≤ b/t_z, we have w(x)/x ≤ (t_z + 1)/(b t_z). 

Proof. Omitted in this version. □ 

Lemma 6. In any packing of the list L into unit-size bins, every unit-size bin 
receives items of total weight at most 

\[ \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{1}{(t_\ell - 1)^2}. \]

Proof. Consider some fixed unit-size bin B that contains the items f_1 ≥ f_2 ≥ 
··· ≥ f_n with total size at most 1. We distinguish three cases. 

(Case 1) For i = 1, ..., ℓ we have f_i ∈ (b/t_i, b/(t_i − 1)]. We denote by F the 
sum of the sizes of the remaining items f_i with i > ℓ. By the definition of the 
values t_i in (1) and (2), we conclude that 

\[ F \le 1 - \sum_{i=1}^{\ell} f_i < 1 - \sum_{i=1}^{\ell} \frac{b}{t_i} = b\, r_\ell \le \frac{b}{t_{\ell+1} - 1}. \tag{10} \]

Hence, all items f_{ℓ+1}, ..., f_n are in the last interval I_{t_ℓ}. By the definition of the 
weight function, the weight of the bin B then is upper bounded by 

\[ \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{F\, t_\ell}{b\, t_\ell - b} \le \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{t_\ell}{(t_\ell - 1)(t_{\ell+1} - 1)} \le \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{1}{(t_\ell - 1)^2}. \tag{11} \]

Here we used (10) to derive the first inequality, and (5) to derive the second 
inequality. This completes the analysis of the first case. 

(Case 2) There exists an integer z with 1 ≤ z ≤ ℓ − 1 such that the following 
holds: For i = 1, ..., z − 1 we have f_i ∈ (b/t_i, b/(t_i − 1)]. Moreover, either f_z does 
not exist (since n = z − 1 holds) or if it does exist then f_z ∉ (b/t_z, b/(t_z − 1)] 
holds. We denote by F the sum of the sizes of the remaining items f_i with i ≥ z. 
Similarly as above, we observe that 

\[ F \le 1 - \sum_{i=1}^{z-1} f_i \le 1 - \sum_{i=1}^{z-1} \frac{b}{t_i}. \tag{12} \]

By combining (12) with the definition of the sequence T(b), we get that the total 
size F of all items f_z, ..., f_n is at most b/(t_z − 1). Since the largest one of all 
these items, f_z, is not contained in the interval (b/t_z, b/(t_z − 1)], we conclude 
that the size of every item f_z, ..., f_n is at most b/t_z. Then by Lemma 5 their 
overall weight is at most F(t_z + 1)/(b t_z). The weight of the bin B is at most 

\[ \sum_{i=1}^{z-1} \frac{1}{t_i - 1} + \frac{F\,(t_z + 1)}{b\, t_z}
   \le \sum_{i=1}^{z-1} \frac{1}{t_i - 1} + \frac{t_z + 1}{t_z} \sum_{i=z}^{\infty} \frac{1}{t_i}
   \le \sum_{i=1}^{\infty} \frac{1}{t_i - 1}
   \le \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{1}{t_{\ell+1} - 2}
   \le \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{1}{(t_\ell - 1)^2}. \]

Here we have first applied (12), together with (3), to bound F from above, then the 
statement in Lemma 3, then the inequality in (6) to bound the tail sum of the terms 
1/(t_i − 1) with i > ℓ from above, and in the end the inequality (5) together with 
t_ℓ ≥ 2. This completes the analysis of the second case. 

(Case 3) This case is essentially the second case with z = ℓ, which needs 
special treatment since the statement in Lemma 5 does not carry over to z = ℓ. 
Assume that for i = 1, ..., ℓ − 1 we have f_i ∈ (b/t_i, b/(t_i − 1)], and that f_ℓ ∉ 
(b/t_ℓ, b/(t_ℓ − 1)]; the subcase where f_ℓ does not exist is trivial. We denote by F 
the sum of the sizes of the items f_i with i ≥ ℓ. Then 

\[ F = \sum_{i=\ell}^{n} f_i \le 1 - \sum_{i=1}^{\ell-1} \frac{b}{t_i} = b\, r_{\ell-1} \le \frac{b}{t_\ell - 1}. \tag{13} \]

Consequently, all items f_ℓ, ..., f_n are contained in the last interval I_{t_ℓ}. Then the 
weight of the bin B is at most 

\[ \sum_{i=1}^{\ell-1} \frac{1}{t_i - 1} + \frac{F\, t_\ell}{b\, t_\ell - b}
   \le \sum_{i=1}^{\ell-1} \frac{1}{t_i - 1} + \frac{t_\ell}{(t_\ell - 1)^2}
   = \sum_{i=1}^{\ell} \frac{1}{t_i - 1} + \frac{1}{(t_\ell - 1)^2}. \]

Here we used (13) to bound F. This completes the proof. □ 

Theorem 7. For any bin size b ≥ 1 and for any real ε > 0, there exist a 
sufficiently large k and an online k-bounded space bin packing algorithm A with 
R_b(A) ≤ ρ(b) + ε. 

Proof. Choose a sufficiently large integer ℓ ≥ 3 such that 1/(t_ℓ − 1)^2 ≤ ε is 
fulfilled. Then we derive from Lemma 4 that A_b(L) ≤ w(L), and we derive from 
Lemma 6 that w(L) ≤ (ρ(b) + ε) · OPT_1(L). □ 



6 Proof of the Lower Bound 



In this section, we prove the lower bound stated in Theorem 8. Consider an 
arbitrary online k-bounded space algorithm A for bin packing with bin size b. Let 
T(b) = (t_1, t_2, ...) be the integer sequence associated with b. Let ℓ be an integer, 
and let ε > 0 be a small real number such that ε · t_{ℓ+1} · ℓ < 1. Furthermore, let 
N be a huge integer. We confront the online algorithm with several 
phases of ‘bad’ items, and we show that algorithm A eventually must perform 
poorly. 

Altogether there are ℓ phases. In the jth phase (j = 1, ..., ℓ), exactly N 
items of size b/t_{ℓ−j+1} + ε arrive. The best that the bounded space algorithm A 
can do is to pack these items together in groups of cardinality t_{ℓ−j+1} − 1 each. 
This consumes N/(t_{ℓ−j+1} − 1) bins. At the beginning of a phase up to k used bins 
of the previous phase are active, and this may save up to k bins. Summarizing, 
algorithm A uses at least N/(t_{ℓ−j+1} − 1) − k bins for packing the items of phase 
j. Adding this up over all j = 1, ..., ℓ, we get that 

\[ A_b(L) \ge \sum_{j=1}^{\ell} \left( \frac{N}{t_{\ell-j+1} - 1} - k \right) = N \cdot \sum_{i=1}^{\ell} \frac{1}{t_i - 1} - k\ell. \tag{14} \]

By (3) and by the choice of ε, the ℓ items b/t_{ℓ−j+1} + ε with 1 ≤ j ≤ ℓ together 
fit into a bin of size 1. Consequently, we have OPT_1(L) ≤ N. By making N 
sufficiently large, (14) yields that the worst case performance R_b(A) of algorithm 
A is at least the sum of 1/(t_j − 1) over j = 1, ..., ℓ. Since this statement holds 
true for every value of ℓ, we may make ℓ arbitrarily large and thus make this 
bound arbitrarily close to ρ(b). 
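
The quantities in this argument are easy to tabulate. The sketch below assumes the recursion t_1 = ⌊b⌋ + 1, t_{i+1} = ⌊1/r_i⌋ + 1 for T(b) and picks ε = 1/(ℓ · t_{ℓ+1} + 1), which is one admissible choice under the assumed condition ε · t_{ℓ+1} · ℓ < 1; the function names are illustrative.

```python
from fractions import Fraction
from math import floor

def sequence_T(b, terms):
    """First `terms` entries of T(b) under the assumed recursion."""
    b = Fraction(b)
    t = floor(b) + 1
    r = 1 / b - Fraction(1, t)
    seq = [t]
    while len(seq) < terms:
        t = floor(1 / r) + 1
        r -= Fraction(1, t)
        seq.append(t)
    return seq

def adversary(b, ell):
    """Return (items fit into a unit bin, groups are as claimed, lower bound)."""
    b = Fraction(b)
    T = sequence_T(b, ell + 1)
    eps = Fraction(1, ell * T[ell] + 1)          # hypothetical choice of epsilon
    sizes = [b / t + eps for t in T[:ell]]       # one item per phase
    fits = sum(sizes) <= 1                       # OPT_1 packs one of each per bin
    groups = all((t - 1) * s <= b < t * s for t, s in zip(T, sizes))
    bound = sum(Fraction(1, t - 1) for t in T[:ell])
    return fits, groups, bound

for ell in (3, 4, 5):
    print(ell, adversary(1, ell))
```

For b = 1 the bounds 5/3, 71/42, ... increase towards ρ(1) ≈ 1.69103 as ℓ grows.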



Theorem 8. For any b ≥ 1 and for any online k-bounded space bin packing 
algorithm A, we have R_b(A) ≥ ρ(b). □ 



Abstract. The list update problem is a classical online problem, with an optimal 
competitive ratio that is still open, somewhere between 1.5 and 1.6. An algorithm 
with competitive ratio 1.6, the smallest known to date, is COMB, a randomized 
combination of BIT and TIMESTAMP. This and many other known algorithms, 
like MTF, are projective in the sense that they can be defined by only looking 
at any pair of list items at a time. Projectivity simplifies both the description of 
the algorithm and its analysis, and so far seems to be the only way to define a 
good online algorithm for lists of arbitrary length. In this paper we characterize all 
projective list update algorithms and show their competitive ratio is never smaller 
than 1.6. Therefore, COMB is a best possible projective algorithm, and any better 
algorithm, if it exists, would need a non-projective approach. 



1 Introduction 

The list update problem is a classical online problem in the area of self-organizing data 
structures. Requests to items in an unsorted linear list must be served by accessing 
the requested item. We assume the partial cost model where accessing the ith item in 
the list incurs a cost of i − 1 units. This is simpler to analyze than the original full cost 
model where that cost is i. The goal is to keep access costs small by rearranging 
the items in the list. After an item has been requested, it may be moved free of charge 
closer to the front of the list. This is called a free exchange. Any other exchange of two 
consecutive items in the list incurs cost one and is called a paid exchange. 

An online algorithm must serve the sequence σ of requests one item at a time, without 
knowledge of future requests. An optimum offline algorithm knows the entire sequence σ 
in advance and can serve it with minimum cost OFF(σ). If the online algorithm serves 
σ with cost ON(σ), then it is called c-competitive if for a suitable constant b 

ON(σ) ≤ c · OFF(σ) + b 

for all request sequences σ. The competitive ratio c in this inequality is the standard 
yardstick for measuring the performance of the online algorithm. The well-known 
move-to-front rule MTF, for example, which moves each item to the front of the list after 
it has been requested, is 2-competitive. This is also the best possible competitiveness 
for any deterministic online algorithm for the list update problem. Another 
deterministic algorithm that is also 2-competitive is TIMESTAMP due to Albers, which 
moves the requested item x in front of all items which have been requested at most once 
since the last request to x. 
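
Both deterministic rules are short to simulate in the partial cost model. The following is a sketch, not code from the paper; it assumes that TIMESTAMP leaves an item in place at its first request and otherwise inserts it directly in front of the frontmost preceding item that was requested at most once since the last request to x. The function names are illustrative.

```python
def serve_mtf(initial, requests):
    """Serve requests with MTF; return (partial cost, final list)."""
    lst, cost = list(initial), 0
    for x in requests:
        pos = lst.index(x)
        cost += pos                      # accessing the i-th item costs i - 1
        lst.insert(0, lst.pop(pos))      # free exchange to the front
    return cost, lst

def serve_timestamp(initial, requests):
    """Serve requests with TIMESTAMP; return (partial cost, final list)."""
    lst, cost, history = list(initial), 0, []
    for x in requests:
        pos = lst.index(x)
        cost += pos
        if x in history:                 # assumed: no move at the first request
            last = len(history) - 1 - history[::-1].index(x)
            since = history[last + 1:]   # requests after the last request to x
            for target in range(pos):
                if since.count(lst[target]) <= 1:
                    lst.insert(target, lst.pop(pos))
                    break
        history.append(x)
    return cost, lst

print(serve_mtf("abc", "cab"))
print(serve_timestamp("abc", "cc"))
```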

Randomized algorithms can perform better on average, as first shown by Irani. 
Such an algorithm is called c-competitive if 

E[ON(σ)] ≤ c · OFF(σ) + b, 

where the expectation is taken over the randomized choices of the online algorithm. 
Randomization is useful only against the oblivious adversary that generates request 
sequences without observing the randomized choices of the online algorithm. If the 
adversary can observe those choices, it can generate requests as if the algorithm was 
deterministic, which is then at best 2-competitive. We therefore consider only the 
interesting situation of the oblivious adversary. 

In this case, lower bounds for the competitive ratio are harder to find; the first 
nontrivial bounds are due to Karp and Raghavan, see the remark in [12]. A general technique 
is Yao’s theorem: If there is a probability distribution on request sequences so that 
the resulting expected competitive ratio for any deterministic online algorithm is d or 
higher, then every deterministic or randomized online algorithm has competitive ratio d 
or higher. In the partial cost model, a lower bound of 1.5 is easy to find as only two 
items are needed. Teia generalized this idea to prove the same bound in the full 
cost model, where long lists are needed. Ambühl, Gärtner and von Stengel showed 
a lower bound of 1.50084 for lists with five items in the partial cost model, using game 
trees and a modification of Teia’s approach. The optimal competitive ratio for the list 
update problem (in the partial cost model) is therefore between 1.50084 and 1.6, but the 
true value is as yet unknown. 

The upper bound of 1.6 is the last link so far in a chain of results, starting with the 
observation that MTF acts too eagerly in moving items to the front. The better algorithms 
BIT, COUNTER, and RANDOM RESET move the requested item to the front or leave 
it at its position, depending on the number of previous requests to the currently requested 
item. The elegant BIT algorithm stores a data bit (initially set to a random value) with 
each item. The bit is flipped at each request, and the item is moved to the front of the list 
when the bit has been set to one. BIT is 1.75-competitive. The related RANDOM RESET 
algorithm has competitive ratio about 1.73. This ratio is improved to the Golden 
Ratio (1 + √5)/2, about 1.62, by treating each item with a randomized combination of 
TIMESTAMP and MTF. 

The best randomized list update algorithm known to date is the 1.6-competitive 
algorithm COMB. It serves the request sequence with probability 4/5 using BIT. 
With probability 1/5, COMB treats the request sequence using TIMESTAMP. 
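
On very small instances the expected cost of BIT, and with it that of COMB, can be computed exactly by enumerating all initial bit vectors. The sketch below is an illustration only; `serve_bit`, `expected_bit_cost` and `expected_comb_cost` are hypothetical names, and the TIMESTAMP cost is passed in as a number rather than recomputed.

```python
from fractions import Fraction
from itertools import product

def serve_bit(initial, requests, bits):
    """Partial cost of BIT for one fixed choice of initial bits."""
    lst, cost = list(initial), 0
    bit = dict(zip(initial, bits))
    for x in requests:
        pos = lst.index(x)
        cost += pos
        bit[x] ^= 1                      # flip the bit at each request
        if bit[x] == 1:                  # move to front when the bit becomes one
            lst.insert(0, lst.pop(pos))
    return cost

def expected_bit_cost(initial, requests):
    """Average of the BIT cost over all equally likely initial bit vectors."""
    n = len(initial)
    total = sum(serve_bit(initial, requests, bits)
                for bits in product((0, 1), repeat=n))
    return Fraction(total, 2 ** n)

def expected_comb_cost(bit_cost, timestamp_cost):
    """COMB uses BIT with probability 4/5 and TIMESTAMP with probability 1/5."""
    return Fraction(4, 5) * bit_cost + Fraction(1, 5) * timestamp_cost

bit_cost = expected_bit_cost("xy", "yy")
print(bit_cost)                          # 3/2
print(expected_comb_cost(bit_cost, 2))   # 8/5, taking 2 as the TIMESTAMP cost
```

In this example the list starts as [xy] and y is requested twice. TIMESTAMP pays 2 if it does not move y at its first request, while an offline algorithm pays 1 by moving y to the front immediately.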

With the exception of Irani’s algorithm SPLIT, all the specific list update 
algorithms mentioned above are projective, meaning that the relative order of any two 
items in the list only depends on previous requests to those items. (A simple example for 
a non-projective algorithm is TRANSPOSE, which moves the requested item just one 
position further to the front.) The main result of this paper is a proof that, surprisingly, 
1.6 is the best possible competitive ratio attainable by a projective algorithm. As a tool, 
we develop an explicit characterization of deterministic projective algorithms in terms 
of two functions for every item, responsible for the “macro”- and the “micro”-behavior 
of the item. 

This result is significant in several respects. First, it puts an end to the search for 
improved algorithms via combinations of existing projective ones. This approach has 
been used successfully in the past, as the results mentioned above indicate, but it has 
reached its limits with the development of the COMB algorithm. New and better 
algorithms (if they exist) have to be non-projective, and must derive from new, yet to be 
discovered, design principles. Second, the characterization of projective algorithms is a 
step forward in understanding the structural properties of list update algorithms. Under 
this characterization, the largest and so far most significant class of algorithms appears 
in a new, unified way. Third, our lower bound construction gives rise to an explicit test 
scenario for new algorithms: we construct a set of request sequences with the property 
that a randomly chosen instance from the set is “hard” for any projective algorithm. A 
new, supposedly better, algorithm should therefore be able to defeat those hard instances, 
and this might be more difficult than to defeat some ad-hoc set of instances. 

2 Projective Algorithms 

Consider a list with n items. For a request sequence σ and two list items x and y, 
the projection of σ on the unordered pair {x, y} is denoted by σ_xy and defined as the 
sequence obtained from σ by deleting all requests to items other than x or y. We write 
σ_x instead of σ_xx. For a given deterministic online algorithm, let S(σ) denote the list 
state after the request sequence σ is served. List states are written as [x_1 x_2 ··· x_n] where 
x_1 is the item at the front of the list. The list state projected to the pair {x, y} is denoted 
by S_xy(σ), which is either [xy] or [yx], indicating the relative position of x and y after 
σ is served. 
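
The two projection operations are simple filters. A minimal sketch, with the illustrative names `project` and `projected_state`:

```python
def project(sigma, items):
    """Projection of the request sequence sigma on the given items."""
    keep = set(items)
    return [r for r in sigma if r in keep]

def projected_state(state, x, y):
    """List state projected to the pair {x, y}: either [x, y] or [y, x]."""
    return [z for z in state if z in (x, y)]

print(project("xzyzx", "xy"))            # ['x', 'y', 'x']
print(projected_state("zyx", "x", "y"))  # ['y', 'x']
```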

A deterministic algorithm is projective if the relative order of any two items after 
any sequence σ does not depend on requests to the other items: 

Definition 1 (Projective Algorithms). A deterministic list update algorithm is 
projective if for all request sequences σ and any two list items x and y 

S_xy(σ) = S_xy(σ_xy). (1) 

A projective algorithm A can be analyzed in a much simpler way than a general one, 
since only two-item lists have to be studied. If the projected cost on two items x, y of 
this algorithm is A_xy (with each request to x or y contributing either 0 or 1, depending 
on whether the requested item is before or behind the other item in the list), then its total 
cost is given by 

\[ A(\sigma) = \sum_{\{x,y\} \subseteq L} A_{xy}(\sigma_{xy}), \tag{2} \]

where L is the set of list items. Furthermore, the optimal offline cost OFF_xy of 
serving σ_xy is easy to describe, and gives a lower bound, defined similarly to (2), 
for the optimal offline cost OFF(σ). This quantity is used to prove the competitive ratio 
of COMB and other algorithms. The simple equation (2) holds only in the partial cost 
model, which is therefore the natural choice when studying projective algorithms. 
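
For MTF, which is projective, the pairwise decomposition of the cost in the partial cost model can be checked directly: the cost on the full list equals the sum of the costs of MTF run on every two-item list with the projected request sequence. The sketch below uses the illustrative names `mtf_cost` and `pairwise_cost`.

```python
from itertools import combinations

def mtf_cost(initial, requests):
    """Partial cost of MTF on the given list and request sequence."""
    lst, cost = list(initial), 0
    for x in requests:
        pos = lst.index(x)
        cost += pos                      # number of items in front of x
        lst.insert(0, lst.pop(pos))
    return cost

def pairwise_cost(initial, requests):
    """Sum of the MTF costs over all pairs, each served on its projection."""
    total = 0
    for x, y in combinations(initial, 2):
        pair_list = [z for z in initial if z in (x, y)]
        pair_reqs = [r for r in requests if r in (x, y)]
        total += mtf_cost(pair_list, pair_reqs)
    return total

sigma = "cabbacabcacb"
print(mtf_cost("abc", sigma), pairwise_cost("abc", sigma))
```

In the full cost model every access pays one extra unit that belongs to no pair, which is why the decomposition is stated for the partial cost model.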

Projective algorithms have a natural generalization, where we demand the relative 
order of any k-tuple of list items to depend only on the requests to these k items. It turns 
out that for lists with more than k items, only projective algorithms satisfy this condition. 
This follows from the fact that e.g. for k = 3, S_xyz(σ) = S_xyz(σ_xyz) implies that the 
relative order of any pair from {x, y, z} is independent of the requests to any item in 
L \ {x, y, z}. As soon as we have k + 1 list items, it is easy to see that the constraints 
for all k-tuples enforce the relative order of any pair of list items to be independent of 
the requests to other items. 

We define a randomized online algorithm as projective if it is a (not necessarily 
finite) probability distribution over deterministic projective algorithms. A less restrictive 
definition is conceivable, but would not allow us to prove the lower bound for projective 
algorithms that we intend and that we think is useful. Namely, one could call a randomized 
online list update algorithm projective if serving any request sequence σ induces a 
distribution on list states S_xy(σ) that only depends on σ_xy. To illustrate the problem 
with this definition, consider the following randomized algorithm on a list of two items 
only: If the current list state is [xy] and y is requested following a request to x, move y 
to the front with probability 1/2, and if y is requested following a request to y (that is, 
y was not moved at the preceding request), then move y to the front with certainty. It is 
not hard to see that such a randomized online algorithm has competitive ratio 1.5, which 
is the best possible. Furthermore, any randomized algorithm showing this as projective 
behavior on a longer list would also be 1.5-competitive. It is indeed possible to construct 
such an algorithm for lists with up to four items using partial orders but impossible 
for lists with five or more items, as recently proved by the authors. 

In the following, we will characterize the deterministic projective algorithms in a way 
that makes their projective behavior transparent, and unifies many known algorithms. By 
our above assumption that considers a randomized projective algorithm as a probability 
distribution over deterministic ones, we will be able to use this characterization in the 
lower bound proof later. 

3 Critical Requests 

Consider a given deterministic online list update algorithm that is projective according 
to Definition 1. In order to obtain a meaningful characterization of such an algorithm, 
we assume n > 2 since on lists with only two items, any algorithm is projective. Let 
i, j > 0 and consider request sequences σ with exactly i requests to x and j requests to 
y (x ≠ y), that is, σ_x = x^i (the i-fold repetition of x) and σ_y = y^j. Then we say x^i and 
y^j are equivalent, written x^i ~ y^j, if there are request sequences σ and σ' so that 

σ_x = σ'_x = x^i, σ_y = σ'_y = y^j, 

S_xy(σ) = [xy], S_xy(σ') = [yx]. (3) 

In other words, one should be able to shuffle the requests to x and y such that either 
x or y is in front after serving the request sequence. Assume, for the moment, that (3) 
holds for any two items x and y and any i, j > 0 (we deal with the general case in the 
next section). Under this assumption, the algorithm can be characterized in terms of the 
central concept of critical requests. 

For any list item x, let F_x : ℕ⁺ → ℕ⁺ be a function so that F_x(i) ≤ i for all i. Then 
F_x defines the critical request to x in a request sequence σ as the F_x(i)th request to x in σ 
if σ_x = x^i. We say that the given algorithm operates according to these critical requests 
if, after serving any request sequence σ, the relative order of the items requested at least 
once is the reverse order of their critical requests in σ. In other words, x precedes y after 
σ with σ_x = x^i, σ_y = y^j and i, j > 0 if and only if the F_x(i)th request to x was later 
than the F_y(j)th request to y. As a simple illustration, observe that MTF uses F_x(i) = i 
for all i > 0 and x ∈ L. In the next section we will also deal with non-requested items. 
Paid exchanges can be represented by critical request functions that are not monotone. 

There is also a natural “dual” way to deal with critical requests: At any time, an 
item precedes another if its critical request was earlier. In this case, we say that the 
algorithm operates dually according to critical requests. For example, operating dually 
on F_x(i) = i, for all i, results in the move-to-back algorithm. Although such behavior 
cannot be competitive, it defines a projective algorithm. 
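
The rule "order the requested items by their critical requests" can be written down directly. The sketch below covers only items that have been requested at least once and assumes they all follow the same rule; `state_from_critical_requests` is an illustrative name, and `F(x, i)` stands for F_x(i).

```python
def state_from_critical_requests(sigma, F, dual=False):
    """Relative order of the requested items after serving sigma."""
    positions = {}
    for idx, x in enumerate(sigma):
        positions.setdefault(x, []).append(idx)
    # Position in sigma of the F_x(i)-th request to x, where i = |sigma_x|.
    crit = {x: p[F(x, len(p)) - 1] for x, p in positions.items()}
    # Primal rule: later critical request means closer to the front.
    return sorted(crit, key=crit.get, reverse=not dual)

mtf = lambda x, i: i                     # F_x(i) = i, as used by MTF
print(state_from_critical_requests("cab", mtf))             # ['b', 'a', 'c']
print(state_from_critical_requests("cab", mtf, dual=True))  # ['c', 'a', 'b']
```

With `dual=True` the same function F_x(i) = i yields the order produced by move-to-back.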

Any critical request functions F_x for the list items x therefore define two list update 
algorithms. The algorithms are projective since F_x(i) does not depend on requests to 
items other than x. Furthermore, condition (3) holds for i, j > 0: sequences σ and σ' 
with σ_xy = x^i y^j and σ'_xy = y^j x^i will always result in S_xy(σ) ≠ S_xy(σ'). The following 
theorem shows that, conversely, any projective list update algorithm fulfilling (3) arises 
from critical requests. 

Theorem 1. Let A be a projective algorithm on a list with n items, n > 2, so that for 
all items x, y and all i, j > 0, property (3) holds. Then A operates (or operates dually) 
according to suitable critical request functions. 

Proof. Assume that i, j, k > 0 and σ and σ' are request sequences (over only three items 
x, y and z) with σ_x = σ'_x = x^i, σ_y = σ'_y = y^j and σ_z = σ'_z = z^k, so that 

S_xy(σ_xy) = [xy] and S_xz(σ_xz) = [zx] (4) 

and 

S_xy(σ'_xy) = [yx] and S_xz(σ'_xz) = [xz]. (5) 

Such sequences exist by assumption (3) and projectivity of A, and since the projections 
considered in these equations can be combined into sequences σ and σ' of requests to 
the three items. 

Note that (4) and (5) imply S_yz(σ) = [zy] and S_yz(σ') = [yz], hence by projectivity 

σ_yz ≠ σ'_yz. (6) 

By labelling each request to an item with its position in the unary projection to that 
item (e.g. the fifth request to x will be labelled x_(5)), σ_xy and σ'_xy (and similarly σ_xz 
and σ'_xz) can be considered as permutations that may be transformed into each other by 
successively transposing pairs of consecutive requests. 

We can even assume that σ_xy and σ'_xy differ only in a single such transposition, 
namely the (not necessarily unique) one responsible for the reversal of the list state S_xy 
during the transformation. Similarly, we assume that σ_xz and σ'_xz differ only in a single 
transposition of consecutive requests to x and z. 

Suppose the transposition σ_xy → σ'_xy involves x_(q) and y_(l), while x_(r) and z_(m) 
participate in the transposition σ_xz → σ'_xz. We now prove that q = r, which is also 
easily seen to imply that this value is well-defined: It neither depends on σ and σ' nor 
on the specific transposition we consider, but only on σ_x. 

For this, assume q ≠ r and consider the sequence σ. W.l.o.g., x_(q) and y_(l) are 
consecutive requests in σ; otherwise we can transpose y_(l) with all in-between requests 
to z without changing the projections to {x, y} and {x, z}. Similarly, suppose that x_(r) 
and z_(m) are consecutive. A sequence σ' satisfying (5) is now obtained from σ by 
transposing both x_(q) with y_(l) and x_(r) with z_(m). Under this operation, however, the 
projection to {y, z} remains invariant, a contradiction to (6). Hence, we must have q = r. 

We have seen that for all items x and all i, there is a unique critical value F_x(i) = q, 
and it remains to show that A operates (or operates dually) according to the F_x(i). 

First of all, we need to show that the relative order of any two items only depends 
on the order of their critical requests. By the above arguments, whenever σ_xy and σ'_xy 
satisfy 

S_xy(σ_xy) ≠ S_xy(σ'_xy) 

and differ only by a single transposition of consecutive requests, this transposition 
involves the critical request of x. By symmetry, it also involves the critical request of y. In 
general, when we transform σ_xy into σ'_xy by transposing consecutive requests, the above 
inequality holds if and only if the two critical requests have been transposed an odd 
number of times, which fixes their relative order. 

Now consider a request sequence σ over an n-item list such that S(σ) = [x_1 x_2 ··· x_n]. 
Let p_i be the position of x_i’s critical request in σ. If we do not have p_1 > p_2 > ··· > p_n 
(A operates on F) or p_1 < p_2 < ··· < p_n (A operates dually on F), we must have 
an index i such that either p_i < p_{i+1} > p_{i+2} or p_i > p_{i+1} < p_{i+2}. In both cases, we 
can manipulate σ such that the critical requests of x_i and x_{i+2} change their order, but 
both keep their relative order w.r.t. the critical request of x_{i+1}. In the list obtained after 
serving σ, items x_i and x_{i+2} change their relative order under this manipulation, while 
they keep their relative order w.r.t. x_{i+1}. This is impossible. □ 

The assumption of at least three list items in the preceding theorem is crucial. On 
lists with only two items, any algorithm is projective, but cannot always be defined in 
terms of critical requests. This follows from cardinality considerations: There are 
(i+j choose i) request sequences with i requests to x and j requests to y, each of which 
can have its own list state after being served, but only i · j many ways of defining 
critical requests. As an example, the algorithm that puts the second-to-last requested 
item at the front of the two-item list cannot be defined in terms of critical requests. 

4 Containers 

Not all projective algorithms fulfill condition (3) for all x^i and y^j. As an example, 
consider the algorithm where all items requested an odd number of times precede all 
items requested an even number of times, and where the items within each of these two 
sets are arranged according to the MTF rule. Then (3) fails if i is odd and j is even or 
vice versa. According to the general characterization of projective algorithms that we 
will give in this section, the odd- and even-requested items in this example form separate 
“containers” that represent sublists of the list. Within one container, items are moved 
according to critical requests, but items in different containers are always in one and the 
same relative position. 

To make this precise, consider a given projective list update algorithm and the set 

U = {x^i | i ∈ ℕ, x ∈ L} 

of unary projections of request sequences, where L denotes the set of items in the list. 

Recall that we write x^i ~ y^j whenever (3) holds. It makes sense to allow i = 0 and 
j = 0 as well. By generalizing (3) we get x^0 ≁ y^j for all x ≠ y and j ∈ ℕ. 

For x, y, z distinct, if x^i ~ y^j and y^j ~ z^k, then, by projectivity, there are always 
two sequences σ and σ' containing the three unary projections such that S(σ) = [xyz] 
and S(σ') = [zyx], which implies x^i ~ z^k. It is easy to see that ~ is an equivalence 
relation on U if we stipulate x^i ~ x^i for any x^i ∈ U and also x^i ~ x^j for i ≠ j if and 
only if there is a z^k with z ≠ x such that x^i ~ z^k and z^k ~ x^j. 

An equivalence class under ~ shall be called a container. Let C denote the set of 
containers and C_x(i) denote the container containing x^i. If x^i and y^j are in different 
containers, we write C_x(i) < C_y(j) whenever for some σ with σ_x = x^i and σ_y = y^j 
we have S_xy(σ) = [xy] (and hence for all such σ, since (3) does not hold). It is easy 
to see that this does not depend on the choice of the representatives x^i and y^j from 
each container. A special case occurs if both x^i and x^j are the only members in their 
respective containers. In this case, no canonical order exists, and we set C_x(i) < C_x(j) 
if i < j. 

Then < defines a total order on the containers, which has a natural interpretation: 
After a request sequence σ, consider the unary projections σ_x for each item x, and their 
corresponding containers. If two projections x^i and y^j are in different containers, x is 
in front of y if and only if C_x(i) < C_y(j). Hence the containers represent sublists of the 
list state S(σ), and we will say that C_x(i) contains the item x if σ_x = x^i. 

This characterizes any projective algorithm, apart from its behavior within each 
container, which is easy to describe: If there is only one item in the container, the position 
of the item is that of the container. A container containing only unary projections for 
two distinct items can have an arbitrary behavior of its items, since any algorithm on 
only two items is projective. If the container contains projections for at least three items, 
Theorem 1 applies, that is, the algorithm operates or operates dually according to suitable 
critical requests defined for the unary projections in that container. For this, observe that 
by definition of ~ all empty projections x^0 are in a container of their own, so whenever 
a container has at least two items, each item has been requested at least once, in which 
case the critical requests exist. 
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To summarize, we can express any deterministic projective algorithm by a pair of 
functions c^; : IN — >■ C and : IN'*' — ?> IN^ for all a: G L = {xi,X 2 , • ■ • , Xn}- The 
ordering of the values (0) corresponds to the initial list state. In the following examples, 
we denote the containers by integers, where the ordering of the integers corresponds to 
the ordering of the containers. 



MTF moves all items into a common container 0 at their first request. All items use
$C_{x_i} = (i, 0, 0, 0, \ldots)$ and $F_{x_i}(k) = k$.

TIMESTAMP moves all items into a common container 0 at their second request. By
definition of an item cannot stay in its initial container after the first request, so
all items will use $C_{x_i} = (2i, 2i-1, 0, 0, 0, \ldots)$ and $F_{x_i}(k) = \max(1, k-1)$.

FREQUENCY COUNT makes heavy use of containers. Items are ordered according
to the number of requests to them, so all items requested $k$ times are in container
$-k$. The functions used are $C_{x_i} = (i, -1, -2, \ldots)$ and $F_{x_i}(k) = k$.

BIT is a randomized algorithm. In the beginning, every item tosses a fair coin to decide
whether it uses the pair $(F^0_{x_i}, C^0_{x_i})$ or $(F^1_{x_i}, C^1_{x_i})$ with $C^0_{x_i} = (2i, 2i-1, 0, 0, 0, \ldots)$,
$F^0_{x_i} = (1, 1, 3, 3, 5, 5, \ldots)$, $C^1_{x_i} = (2i, 0, 0, 0, \ldots)$, $F^1_{x_i} = (1, 2, 2, 4, 4, 6, 6, \ldots)$.



5 Lower Bound 



In this section, we use the characterization of projective algorithms from the previous 
section to prove that no such algorithm is better than 1.6-competitive. Algorithms with a
good competitive ratio never operate dually according to critical requests, and have the
critical request close to the last request, for every item. That is, they fulfill $i - F_x(i) \le 1$
most of the time. Therefore we work with $f_x(i) = i - F_x(i)$ in the following. Recall
that MTF is defined by $f_x(i) = 0$, and TIMESTAMP by $f_x(i) = 1$.

Motivated by this discussion, we consider projective algorithms A for lists of more 
than two items that fulfill the following additional assumptions: 

(i) A constant p exists such that after at most p requests to every item, all items 
will reside in a single common container, and A operates (i.e. does not operate 
dually) according to critical requests within that container, and 

(ii) the values fx{i) that determine the critical requests can be uniformly bounded 
by some constant M, where w.l.o.g. M > 3. 

Let us call an algorithm satisfying (i) and (ii) regular. 

Given any $\varepsilon > 0$ and $b$, we will show that there is a probability distribution $\pi$ on a
finite set $\Lambda$ of request sequences so that

$$\sum_{\lambda \in \Lambda} \pi(\lambda)\, \frac{A(\lambda)}{\mathrm{OFF}(\lambda) + b} \;\ge\; 1.6 - \varepsilon \qquad (8)$$

for any deterministic regular algorithm A. Then Yao's theorem G3) asserts that also any
randomized regular algorithm has competitive ratio $1.6 - \varepsilon$ or larger. This holds for any
fixed $p$ and $M$. Hence the competitive ratio is at least 1.6. This is achieved by COMB
and therefore a tight bound for projective algorithms.

The same holds for general projective algorithms, but we defer the technicalities of 
that to the full paper (see also section 6).

All $\lambda \in \Lambda$ will consist of only two items $x$ and $y$. With the constant $M$ from (ii), let

$$\varphi = x^M\; yxyx^M\; yx^M\; y^M\; xyxy^M\; xy^M\; x^M\; y^M. \qquad (9)$$

$\varphi$ consists of eight blocks, each of which ends in $x^M$ or $y^M$. Let $H = |\varphi|/2$, and let $k$
and $T$ be arbitrary positive integers. Then the set of sequences in (8) is given by

$$\Lambda = \{\, x^t x^{M+h} y^{M+h} \varphi^k \mid 0 \le h < H,\ 0 \le t < T \,\}, \qquad (10)$$

where any $\lambda$ in $\Lambda$ is chosen with equal probability $1/HT$ by $\pi$.

First, observe that OFF pays ten units for each repetition of $\varphi$ (which always starts
in offline list state $[yx]$), and therefore all sequences in $\Lambda$ have equal offline cost $1 + 10k$.
This and the fact that $\pi(\lambda)$ for $\lambda \in \Lambda$ is constant allows us to rewrite (8) as

$$\frac{\sum_{\lambda \in \Lambda} A(\lambda)}{\sum_{\lambda \in \Lambda} (\mathrm{OFF}(\lambda) + b)} \;\ge\; 1.6 - \varepsilon. \qquad (11)$$

The offline cost $\mathrm{OFF}(\lambda)$ in (11), as well as the online cost $A(\lambda)$, can grow arbitrarily
large with $k$, so that we can assume w.l.o.g. that $b = 0$ in (11), adapting $\varepsilon$ suitably.
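
The offline cost of ten units per repetition can be checked mechanically. The following sketch is not part of the paper: it assumes the block pattern of (9) as eight blocks ending in M copies of x or y, the partial cost model (serving the item in second position costs one unit, the front item is free), free moves of the requested item to the front, and paid exchanges at one unit each. The names `phi` and `offline_cost` are illustrative only.

```python
from math import inf

def phi(M):
    """Request pattern of eight blocks, each ending in M copies of x or y."""
    # (prefix, item repeated M times) for each block
    blocks = [("", "x"), ("yxy", "x"), ("y", "x"), ("", "y"),
              ("xyx", "y"), ("x", "y"), ("", "x"), ("", "y")]
    return "".join(prefix + item * M for prefix, item in blocks)

def offline_cost(requests, front):
    """Optimal offline cost on a two-item list, partial cost model (assumed).

    cost[s] is the cheapest cost of serving the requests seen so far
    and ending with item s at the front of the list.
    """
    other = {"x": "y", "y": "x"}
    cost = {front: 0, other[front]: inf}
    for r in requests:
        o = other[r]
        # r ends in front: it was in front already (free access),
        # or it was behind (pay 1) and is moved forward for free.
        r_front = min(cost[r], cost[o] + 1)
        # o ends in front: r was behind (pay 1) and stays there,
        # or r was in front (free access) followed by a paid exchange.
        o_front = min(cost[o] + 1, cost[r] + 1)
        cost = {r: r_front, o: o_front}
    return min(cost.values())

print(offline_cost(phi(3), "y"))  # one repetition, starting in list state [yx]
```

Under these assumptions one repetition started in list state [yx] costs 10, and the four continuations listed in condition (a), served from list state [xy], cost 5 in total.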

In the rest of this section we show that (10) yields the desired property (11). We say
that A is in state $(i,j)$ if it has served $\sigma$ with $\sigma_x = x^i$ and $\sigma_y = y^j$, where $\sigma$ is some
prefix of a sequence $\lambda$ in $\Lambda$. That sequence $\sigma$ is a random variable since the particular
order of requests to $x$ and $y$ is usually not known.

We say that a sequence $\lambda$ in $\Lambda$ switches from $x$ in state $(i,j)$ if $\lambda$ has the prefix $\sigma$
with $\sigma_x = x^i$ and $\sigma_y = y^j$ and $\sigma$ ends in $x^M$. Similarly, we say that $\lambda$ switches from $y$
in state $(i,j)$ if that prefix $\sigma$ ends in $y^M$. A state $(i,j)$ is called good if it fulfills the
following conditions: 

(a) there are four sequences in $\Lambda$ that switch from $x$ in state $(i,j)$. They continue
with the requests $y^M$, $y^M$, $yx^M$, and $yxyx^M$, respectively;

(b) the same holds with $x$ and $y$ interchanged;

(c) properties (a) and (b) also hold for the states $(i-1, j)$ and $(i, j-1)$.

This means, for every good state $(i,j)$ and each of the eight blocks in $\varphi$ there is
exactly one sequence $\lambda$ in $\Lambda$ that starts with this block in state $(i,j)$. Let $A(i,j)$ be
the sum of costs incurred by A on those eight blocks, and $\mathrm{OFF}(i,j)$ the corresponding
sum of offline costs. In general, $A(i,j)$ denotes the sum of costs incurred by A on all
next blocks of the (at most eight) sequences switching from $x$ or $y$ in state $(i,j)$, and
$\mathrm{OFF}(i,j)$ is the corresponding offline cost.

We will show that for every good state $(i,j)$, $A(i,j)$ is at least 16, while $\mathrm{OFF}(i,j)$ is
always 10. Together with the facts that most states are good states (which we will prove
below), and that the cost for serving the initial prefix is independent of $k$
and can therefore be neglected for large $k$, this gives

$$\frac{\sum_{\lambda \in \Lambda} A(\lambda)}{\sum_{\lambda \in \Lambda} \mathrm{OFF}(\lambda)}
= \frac{\sum_{(i,j)} A(i,j) + \sum_{h,t} A(x^t x^{M+h} y^{M+h})}{\sum_{(i,j)} \mathrm{OFF}(i,j) + \sum_{h,t} \mathrm{OFF}(x^t x^{M+h} y^{M+h})}
\approx \frac{\sum_{(i,j)\ \mathrm{good}} A(i,j)}{\sum_{(i,j)\ \mathrm{good}} \mathrm{OFF}(i,j)}
\ge \min_{(i,j)\ \mathrm{good}} \frac{A(i,j)}{\mathrm{OFF}(i,j)},$$

thus proving the lower bound, because the minimum is at least 16/10 = 1.6.
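
As a plausibility check of the ratio 16/10, one can simulate the two simplest regular algorithms on repetitions of the block pattern. The sketch below is an illustration under assumptions, not the paper's argument: it takes the pattern of (9) to be eight blocks ending in M copies of x or y, uses the partial cost model on a two-item list, and models an algorithm with constant $f$ by placing in front the item whose critical request (the request $f$ positions before its latest one) is more recent. With $f = 0$ this is MTF, with $f = 1$ it is TIMESTAMP. The names `phi` and `online_cost` are hypothetical.

```python
def phi(M):
    """Request pattern of eight blocks, each ending in M copies of x or y."""
    blocks = [("", "x"), ("yxy", "x"), ("y", "x"), ("", "y"),
              ("xyx", "y"), ("x", "y"), ("", "x"), ("", "y")]
    return "".join(prefix + item * M for prefix, item in blocks)

def online_cost(requests, f, front):
    """Cost of a two-item critical-request algorithm with constant offset f.

    The list state after each request puts the item with the later
    critical request in front; an item with fewer than f + 1 requests
    has no critical request yet (encoded as time -1).
    """
    times = {"x": [], "y": []}
    cost = 0
    for t, r in enumerate(requests):
        if r != front:          # requested item is in second position
            cost += 1
        times[r].append(t)
        crit = {s: (ts[-1 - f] if len(ts) > f else -1)
                for s, ts in times.items()}
        if crit["x"] != crit["y"]:
            front = max(crit, key=crit.get)
    return cost

for f, name in [(0, "MTF"), (1, "TIMESTAMP")]:
    print(name, online_cost(phi(3) * 2, f, "y"))
```

Under these assumptions both algorithms pay 16 units per repetition of the pattern when started in list state [yx], which matches the bound $A(i,j) \ge 16$ against an offline cost of 10.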

By considering each of the four continuing sequences $y^M$, $y^M$, $yx^M$, and $yxyx^M$
in (a), we see that the sum of their offline costs is five. Hence, we have to show that
the sum of online costs is at least eight. This is not always the case: It is possible that a
certain choice of the critical request functions will result in an online cost of only seven
units. However, we will show that those particular critical requests will incur nine units
of online cost in state $(i-1, j)$, one of which we can “borrow” for $(i,j)$. This results
in eight units of online cost, for all good states.

Consider a sequence in a good state $(i,j)$ that, as in (a), switches from $x$, so that
the next requests are $y^M$, $y^M$, $yx^M$, and $yxyx^M$ (our reasoning will then apply to (b)
by symmetry). Since the first two of these requests are $yy$, $yy$, $yx$, $yx$, their total online
cost is six units: By assumption (ii), the critical request to $x$ is among the preceding
requests $x^M$. Hence four units are to be paid for the first request to $y$, and then two extra
units, namely on the request to $x$ in $yx$ and $yx$ if $f_y(j+1) = 0$, and on the second
request to $y$ in $yy$ and $yy$ if $f_y(j+1) \ge 1$. A seventh unit is spent in serving the second
request to $y$ or to $x$ in $yxyx^M$. The only case where no more than these seven online
cost units occur is if

$$f_x(i+1) = 0 \quad\text{and}\quad f_y(j+2) = 1. \qquad (12)$$

Namely, $f_y(j+2) \le 1$ is necessary since otherwise $y$ would not be in front of $x$
at the third request to $y$ in the sequences $y^M$, adding at least two more cost units. If
$f_y(j+1) = 0$ then we need $f_x(i+1) = 0$ to avoid another cost unit when serving the
second request to $x$ in $yx^M$, and hence $f_y(j+2) \ge 1$ to avoid that $y$ is moved to the front
at the second request to $y$ in $yxyx^M$ since that would create an extra cost unit for serving
the second $x$. If $f_y(j+1) \ge 1$, then we need again $f_x(i+1) = 0$ and $f_y(j+2) \ge 1$
to avoid that $y$ is moved to the front at the second request to $y$ in $yxyx^M$. Together, this
implies (12). Hence, whenever (12) fails, the online algorithm incurs eight or more unit
costs.

The seven online cost units in case (12) create nine online cost units in state $(i-1, j)$
for the sequences switching from $y$. Namely, by (c), the subsequent requests are $x^M$,
$x^M$, $xy^M$, and $xyxy^M$. As before, six online units will be spent on the first two of these
requests which are $xx$, $xx$, $xy$, $xy$, and a seventh unit either on the second $y$ in $xy^M$
or on the second $x$ in $xyxy^M$. That last sequence, however, will incur two additional
cost units: Because of (12), $x$ is in front of $y$ after the third and fourth request in $xyxy^M$,
causing two more units for serving the second and third requests to $y$.

It remains to show that most states are good. The sequences $\lambda$ in $\Lambda$ are, by (10),
essentially $k$-fold repetitions of $\varphi$.

The random initial subsequence $x^{M+h} y^{M+h}$ of $\lambda$ means that all pairs of states $(i,j)$
and $(i+1, j+1)$, apart from those with low or high values of $i$ and $j$, are equally likely
reached by any request in $\varphi$. Note that $|\varphi_x| = |\varphi_y|$. The additional prefix $x^t$ of $\lambda$ does
the same for the pairs of states $(i,j)$ and $(i+1, j)$ except for those with small or large
values of $i$ and $j$. Figure 1 displays the good and the bad states in a diagram. There
are $O(kHT)$ good states, but only $O(kH^2)$ bad ones. By choosing $T$ large enough, the
contribution of bad states can be neglected, so that (11) holds.




Fig. 1. xy-diagram (axis label: x-counter value)

We have proved a lower bound of 1.6 for the competitive ratio of any regular
projective algorithm A, defined by a probability distribution over deterministic regular
algorithms. By definition, this means that A is using only one active container in the 
long run, and that the distance between current and critical request is bounded for all 
items. Because of lack of space, we will deal with the other cases only in the full version. 
We thus obtain 

Theorem 2. Any projective list update algorithm has a competitive ratio of at least 
1.6 in the partial cost model, and this bound is best possible, as the algorithm COMB 
demonstrates. 

6 Conclusion 

An open problem is to extend this result to the full cost model, even though this model 
is not very natural in connection with projective algorithms. This would require request 
sequences over arbitrarily many items, and it is not clear whether an approach similar 
to the one given here can work. 

Another ambitious goal is to further improve the lower bound in case of
non-projective algorithms. Here, the techniques of the paper do not apply at all, and to
get improvements that are substantially larger than the ones obtainable with the methods
of |*5f| requires substantial new insights.

Finally, the search for good non-projective algorithms has become an issue with 
our result. Irani’s SPLIT algorithm liTTunr is the only one known of this kind with a 
competitive ratio below 2. A major obstacle for finding such algorithms is the difficulty 
of their analysis, because pairwise methods are not applicable, and other methods (e.g. 
the potential function method) have not been studied in depth. We hope that our result 
can stimulate further research in this direction.

Abstract. We study the problem of strong/weak bisimilarity between processes of
one-counter automata and finite-state processes. We show that the problem of weak 
bisimilarity between processes of one-counter nets (which are ‘weak’ one-counter 
automata) and finite-state processes is DP-hard (in particular, it means that the 
problem is both NP and co-NP hard). The same technique is used to demonstrate 
co-NP-hardness of strong bisimilarity between processes of one-counter nets. 
Then we design an algorithm which decides weak bisimilarity between processes 
of one-counter automata and finite-state processes in time which is polynomial 
for most ‘practical’ instances, giving a characterization of all hard instances as a 
byproduct. Moreover, we show how to efficiently compute a rather tight bound 
for the time which is needed to solve a given instance. Finally, we prove that the 
problem of strong bisimilarity between processes of one-counter automata and 
finite-state processes is in P. 



1 Introduction 

In concurrency theory, processes are typically understood as (being associated with)
states in transition systems, a fundamental and widely accepted model of discrete
systems. Formally, a transition system is a triple $T = (S, \Sigma, \to)$ where $S$ is a set of states,
$\Sigma$ is a finite set of actions (or labels), and $\to \subseteq S \times \Sigma \times S$ is a transition relation. We
write $s \stackrel{a}{\to} t$ instead of $(s, a, t) \in \to$ and we extend this notation to elements of $\Sigma^*$ in
the natural way. A state $t$ is reachable from a state $s$ iff there is $w \in \Sigma^*$ such that $s \stackrel{w}{\to} t$.
A system $T$ is finite-state iff the set of states of $T$ is finite.

The equivalence approach to formal verification of concurrent systems is based on
the following scheme: One describes the specification (the intended behaviour) $S$ and
the implementation $I$ of a given system in some ‘higher’ formalism whose semantics is
given in terms of transition systems, and then it is shown that $S$ and $I$ are equivalent.
Actually, there are many ways to capture the notion of process equivalence (see, e.g.,
iTSi ). It seems, however, that bisimulation equivalence is of special importance,
as its accompanying theory has been developed very intensively. Let $T = (S, \Sigma, \to)$
be a transition system. A binary relation $R \subseteq S \times S$ is a bisimulation iff whenever
$(s, t) \in R$, then for each $s \stackrel{a}{\to} s'$ there is some $t \stackrel{a}{\to} t'$ such that $(s', t') \in R$, and for each
$t \stackrel{a}{\to} t'$ there is some $s \stackrel{a}{\to} s'$ such that $(s', t') \in R$. States $s$, $t$ are bisimulation equivalent
(or bisimilar), written $s \sim t$, iff there is a bisimulation relating them. Bisimulations can

* Supported by the Grant Agency of the Czech Republic, grants No. 201/98/P046 and 
No. 201/00/0400. 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 317-El 2000.

also be used to relate states of different transition systems; formally, two systems can
be considered as a single one by taking their disjoint union. An important variant of
bisimilarity is weak bisimilarity introduced by Milner in his work on CCS |E|. This
relation distinguishes between ‘external’ and ‘internal’ computational steps, and allows
one to ‘ignore’ the internal steps (which are usually denoted by a distinguished action $\tau$) to
a certain extent. Formally, we define the extended transition relation $\Rightarrow\ \subseteq S \times \Sigma \times S$
as follows: $s \stackrel{\tau}{\Rightarrow} t$ iff $t$ is reachable from $s$ via a finite (and possibly empty) sequence
of transitions labelled by $\tau$ (note that $s \stackrel{\tau}{\Rightarrow} s$ for each $s$), and $s \stackrel{a}{\Rightarrow} t$ where $a \ne \tau$ iff
there are states $u$, $v$ such that $s \stackrel{\tau}{\Rightarrow} u \stackrel{a}{\to} v \stackrel{\tau}{\Rightarrow} t$. The relation of weak bisimulation is
defined in the same way as bisimulation, but ‘$\Rightarrow$’ is used instead of ‘$\to$’. Processes $s$, $t$
are weakly bisimilar, written $s \approx t$, iff there is a weak bisimulation relating them. To
prevent confusion about bisimilarity and weak bisimilarity, we refer to bisimilarity as
strong bisimilarity in the rest of this paper.
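
For finite-state systems, the definition of weak bisimilarity can be turned directly into a naive fixpoint computation. The following sketch is only an illustration and not an algorithm from this paper: it assumes a finite system given as a set of triples `(s, a, t)`, uses the hypothetical label `"tau"` for the internal action, computes the extended relation by closing under internal steps, and then removes pairs that violate the bisimulation condition until nothing changes.

```python
TAU = "tau"  # assumed name for the internal action

def tau_closure(states, trans):
    """reach[s] = states reachable from s by zero or more tau-steps."""
    reach = {s: {s} for s in states}
    changed = True
    while changed:
        changed = False
        for (s, a, t) in trans:
            if a != TAU:
                continue
            for u in states:
                if s in reach[u] and t not in reach[u]:
                    reach[u].add(t)
                    changed = True
    return reach

def weak_steps(states, trans):
    """weak[(s, a)] = targets of the extended transition relation."""
    reach = tau_closure(states, trans)
    weak = {(s, TAU): set(reach[s]) for s in states}
    for s in states:
        for (u, a, v) in trans:
            if a != TAU and u in reach[s]:
                weak.setdefault((s, a), set()).update(reach[v])
    return weak

def weakly_bisimilar(states, trans, s0, t0):
    """Greatest fixpoint: start from all pairs, drop violating pairs."""
    weak = weak_steps(states, trans)
    actions = {a for (_, a, _) in trans} | {TAU}
    rel = {(s, t) for s in states for t in states}

    def ok(s, t):
        forth = all(any((s2, t2) in rel for t2 in weak.get((t, a), ()))
                    for a in actions for s2 in weak.get((s, a), ()))
        back = all(any((s2, t2) in rel for s2 in weak.get((s, a), ()))
                   for a in actions for t2 in weak.get((t, a), ()))
        return forth and back

    changed = True
    while changed:
        changed = False
        for pair in list(rel):
            if pair in rel and not ok(*pair):
                rel.discard(pair)
                changed = True
    return (s0, t0) in rel
```

Two systems are compared by passing their disjoint union. For instance, a process that alternates one internal step with an `a`-step is weakly bisimilar to a one-state process with an `a`-loop, whereas a process that can silently move into a deadlocked state is not.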

In this paper we study the complexity of checking strong and weak bisimilarity
between processes of transition systems generated by (certain subclasses of) pushdown
automata and processes of finite-state systems. A pushdown automaton is a tuple $\mathcal{P} =
(Q, \Gamma, \Sigma, \delta)$ where $Q$ is a finite set of control states, $\Gamma$ is a finite stack alphabet, $\Sigma$
is a finite input alphabet, and $\delta : (Q \times \Gamma) \to 2^{\Sigma \times (Q \times \Gamma^*)}$ is a transition function
with finite image. We can assume (w.l.o.g.) that each transition increases the height (or
length) of the stack at most by one (each PDA can be efficiently transformed to this
kind of normal form). To $\mathcal{P}$ we associate the transition system $T_{\mathcal{P}}$ where $Q \times \Gamma^*$ is
the set of states, $\Sigma$ is the set of actions, and the transition relation is determined by
$(p, A\alpha) \stackrel{a}{\to} (q, \beta\alpha)$ iff $(a, (q, \beta)) \in \delta(p, A)$. As usual, we write $p\gamma$ instead of $(p, \gamma)$
and we use $\varepsilon$ to denote the empty word. The size of $\mathcal{P}$ is the length of a string which
is obtained by writing all elements of the tuple linearly in binary. The size of a process
$p\alpha$ of $T_{\mathcal{P}}$ is the length of its corresponding binary encoding. Pushdown processes (i.e.,
processes of pushdown automata) have their origin in the theory of formal languages |Q,
but recently (i.e., in the last decade) they have been found appropriate also in the context
of concurrency theory because they provide a natural and important model of sequential
systems. In this paper we mainly concentrate on a subclass of pushdown automata where
the stack behaves like a counter. Such a restriction is reasonable because in practice we
often meet systems which can be abstracted to finite-state programs operating on a
single unbounded variable. For example, network protocols can maintain the count of
how many unacknowledged messages have been sent, a printer spool should know how
many processes are waiting in the input queue, etc. Formally, a one-counter automaton
$\mathcal{A}$ is a pushdown automaton with just two stack symbols $I$ and $Z$; the transition function
$\delta$ of $\mathcal{A}$ is a union of functions $\delta_Z$ and $\delta_I$ where $\delta_Z : (Q \times \{Z\}) \to 2^{\Sigma \times (Q \times (\{I\}^* \{Z\}))}$
and $\delta_I : (Q \times \{I\}) \to 2^{\Sigma \times (Q \times \{I\}^*)}$. Hence, $Z$ works like a bottom symbol (which
cannot be removed), and the number of $I$'s which are stored in the stack represents the
counter value. Processes of $\mathcal{A}$ (i.e., states of $T_{\mathcal{A}}$) are of the form $pI^iZ$. In the rest of
this paper we adopt a more intuitive notation, writing $p(i)$ instead of $pI^iZ$. It is worth
noting that the size of $p(i)$ is $O(i)$ and not $O(\log i)$, because $p(i)$ is just a symbolic
abbreviation for $p\alpha Z$ where $\alpha$ is a string of $i$ symbols $I$. Again, we assume (w.l.o.g.) that
each transition increases the counter at most by one. A proper subclass of one-counter
automata of its own interest are one-counter nets. Intuitively, OC-nets are ‘weak’
OC-automata which cannot test for zero explicitly. They are computationally equivalent
to a subclass of Petri nets with (at most) one unbounded place. Formally, a
one-counter net $\mathcal{N}$ is a one-counter automaton such that whenever $(a, qI^iZ) \in \delta_Z(p, Z)$,
then $(a, qI^{i+1}) \in \delta_I(p, I)$. In other words, each transition which is enabled at zero-level
is also enabled at (each) non-zero-level. Hence, there are no ‘zero-specific’ transitions
which could be used to ‘test for zero’.

Observe that the out-going transitions of a OC process q{i) where i > 0 do not 
depend on the actual value of i. Hence, the structure of transition systems which are 
associated with OC-automata (and, in particular, with OC-nets) is rather regular — they 
consist of a ‘zero pattern’ and a ‘non-zero pattern’ which is repeated infinitely often. 
Despite this regularity, some problems for OC-automata (and even for OC-nets) are 
computationally hard, as we shall see in the next section. 
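
The ‘zero pattern’ and ‘non-zero pattern’ can be made concrete with a small sketch. The encoding below is an assumption chosen for illustration, not the paper's notation: a rule `(a, q, k)` in `delta_Z[p]` stands for $(a, qI^kZ) \in \delta_Z(p, Z)$, and a rule `(a, q, k)` in `delta_I[p]` stands for $(a, qI^k) \in \delta_I(p, I)$, so that the top symbol $I$ is replaced by $k$ copies of $I$. The function names are hypothetical.

```python
def successors(delta_Z, delta_I, p, i):
    """Outgoing transitions of the process p(i) as triples (a, q, j) for q(j)."""
    if i == 0:
        # zero pattern: Z stays at the bottom, k symbols I are pushed
        return {(a, q, k) for (a, q, k) in delta_Z.get(p, set())}
    # non-zero pattern: the top I is replaced by k symbols I
    return {(a, q, i - 1 + k) for (a, q, k) in delta_I.get(p, set())}

def is_net(delta_Z, delta_I):
    """Check the one-counter net condition: every rule enabled at zero
    level has a counterpart with the same effect at non-zero level."""
    return all((a, q, k + 1) in delta_I.get(p, set())
               for p, rules in delta_Z.items()
               for (a, q, k) in rules)

# A plain counter: increment anywhere, decrement only above zero.
delta_Z = {"p": {("inc", "p", 1)}}
delta_I = {"p": {("inc", "p", 2), ("dec", "p", 0)}}
print(successors(delta_Z, delta_I, "p", 0))
print(successors(delta_Z, delta_I, "p", 3))
print(is_net(delta_Z, delta_I))
```

In this encoding the successors of `p(i)` for any two positive values of `i` differ only by a shift of the counter, which is the regularity the paragraph above describes. Adding a rule to `delta_Z` without a matching rule in `delta_I` amounts to a test for zero and makes `is_net` fail.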

Now we give a short summary of relevant results for PDA and OC automata. The 
decidability of strong bisimilarity for processes of stateless PDA (which are also known 
as BPA processes) is due to Q. Another (incomparable) positive result is ©] where it is 
shown that strong bisimilarity is decidable for processes of OC-automata. These results 
have been recently extended to general PDA in 1 1 711 . The problem of weak bisimilarity 
is still open for all of the mentioned (sub)classes. The decidability of strong/weak bisim- 
ilarity between processes of a (general) class C and finite-state ones has been studied 
in [Q. It is shown that the problem can be reduced to the model-checking problem for 
a temporal logic EF and processes of C. Since EF is decidable for PDA processes, it 
suffices for showing the decidability, but the obtained algorithm is not very efficient; we
only obtain EXPTIME upper-bound in this way for both strong and weak bisimilarity. 
Recently, PSPACE lower-bound for the problem of strong (and hence also weak) bisim- 
ilarity between PDA and FS processes has been given in Ha. A somewhat surprising 
result is CDl which says that strong and weak bisimilarity between BPA processes and 
finite-state ones is in P. OC-nets are studied, e.g., in [TO where it is shown that simula- 
tion equivalence (which is coarser than strong bisimilarity) is decidable for processes of 
OC-nets, and in ® where a close relationship between simulation problems for OC-nets 
and the corresponding bisimulation problems for OC-automata is established. 

In this paper we concentrate on the complexity of checking strong and weak
bisimilarity between processes of OC-automata and FS processes. Our motivation is that the
specification or the implementation of a system which is to be verified (see above) can
often be specified as a finite-state process. Moreover, a number of ‘classical’ verification
problems (e.g., liveness, safety) can be easily reduced to the problem of weak
bisimilarity with a finite-state system. For example, if we want to check that the action $a$ is live
for a process $g$ (i.e., each state which is reachable from $g$ can reach a state which can
emit $a$), we can rename all actions of $g$ except $a$ to $\tau$ and then check weak bisimilarity
between $g$ and $f$ where $f$ is a one-state process with the only transition $f \stackrel{a}{\to} f$.

In Section 2 it is shown that the problem of weak bisimilarity between processes
of OC-nets and FS processes is DP-hard, even for a fixed finite-state process
(intuitively, the class DP M is expected to be somewhat larger than the union of NP and
co-NP; however, it is still contained in the $\Delta_2 = \mathrm{P}^{\mathrm{NP}}$ level of the polynomial
hierarchy). Here we have to devise a special technique for encoding, guessing, and checking
assignments of Boolean variables in the structure of OC-nets. As transition systems
which are associated with OC-nets are rather regular, the method is not straightforward
(observe that assignments are easy to handle with a stack; it is not so easy if there is
only (one) counter at our disposal). Using the same technique we also show that strong
bisimilarity between processes of OC-nets is co-NP-hard (strong bisimilarity between
processes of OC-automata and finite-state processes is already polynomial; see below).
Assuming the expected relationship among complexity classes, the DP-hardness result 
for weak bisimilarity actually says that any deterministic algorithm which decides the 
problem requires exponential time in the worst case. Rather than trying to establish 
DP-completeness, we turn our attention to a more ‘practical’ direction — in Section 3 
we design an algorithm which decides weak bisimilarity between a process p{i) of a 
OC-automaton A and a process / of a finite-state system T in time 0{n^ m? (i -f 1)) 
where n is the size of A, m is the size of T, and z is a special constant which depends on 
A. So, if there was no z, or if z was always ‘small’, the problem would be in P. However, 
z can be much (exponentially) larger than n in general. However, it follows from the way 
how z is defined that the automaton must be very perverse to make its associated z large 
(a good example is the automaton constructed in the DP-hardness proof of Section 2). 
Hence, we conclude that our algorithm is actually efficient for many (if not all) practical 
instances, giving a sort of ‘characterization’ of all hard instances as a byproduct. Another 
advantage of our algorithm is that we can efficiently estimate the time which is needed 
to solve a given instance — although the computation of z for a given automaton A may 
take exponential time in general, we can efficiently (i.e., in polynomial time) compute 
a quite reliable bound for z. All hard instances are efficiently recognized in this way; it 
can also happen that some ‘easy’ instance is incorrectly declared as hard, but we argue 
that such situations are quite rare. The algorithm also works for strong bisimilarity, but 
in this case it only needs polynomial time — we obtain (as a simple consequence) that 
the problem of strong bisimilarity between OC processes and finite-state ones is in P. 
Proofs which were omitted due to space constraints can be found in [10]. 

2 Lower Bounds 

In this section we show that the problem of weak bisimilarity between processes of 
OC-nets and finite-state processes is DP-hard (even for a fixed finite-state process), and 
that the problem of strong bisimilarity between processes of OC-nets is co-NP-hard. 

Theorem 1. The problem of weak bisimilarity between processes of one-counter nets
and finite-state processes is DP-hard.

Proof For purposes of this proof, we first fix the following finite-state system 




We show DP-hardness by reduction of the DP-complete problem Sat-Unsat. An
instance of the Sat-Unsat problem is a pair $(\varphi_1, \varphi_2)$ of Boolean formulae in CNF.
The question is whether $\varphi_1$ is satisfiable and $\varphi_2$ unsatisfiable. First, we describe a
polynomial algorithm which for a given formula $\varphi$ in CNF constructs a one-counter net
$\mathcal{N}_\varphi$ and its process $s_\varphi(0)$ such that $\varphi$ is satisfiable iff $s_\varphi(0) \approx P_1$, and $\varphi$ is unsatisfiable
iff $s_\varphi(0) \approx P_2$, where $P_1$, $P_2$ are the (fixed) FS processes of the system T. It clearly
suffices for our purposes, because then we can also construct a one-counter net M 
by taking the disjoint union of and adding a new control state s together 

with transitions sZ s^^Z,sl ^ s^p^l and aZ ^ s^p^Z,sI ^ s^p^I (the non-zero 
transitions are added just to fulfil the constraints of the definition of OC nets). Clearly
$(\varphi_1, \varphi_2)$ is a positive instance of the Sat-Unsat problem iff $s(0) \approx P$ where $P$ is the
fixed FS process of the system T.

In our proof we use the following theorem of number theory (see, e.g., 10): Let $\pi_i$
be the $i$-th prime number, and let $f : \mathbb{N} \to \mathbb{N}$ be a function which assigns to each $n$
the sum $\sum_{i=1}^{n} \pi_i$. Then $f$ is $O(n^3)$. (In our case, it suffices to know that the sum is
asymptotically bounded by a polynomial in $n$.) With the help of this fact we can readily
confirm that the construction described below is indeed polynomial.

Let $\varphi = C_1 \wedge \cdots \wedge C_m$ be a formula in CNF where $C_i$ are clauses over propositional
variables $x_1, \ldots, x_n$. We assume (w.l.o.g.) that for every assignment $\nu : \{x_1, \ldots, x_n\} \to
\{true, false\}$ there is at least one clause $C_i$ such that $\nu(C_i) = true$ (this can be
achieved, e.g., by adding the clause $(x_1 \vee \neg x_1)$ to $\varphi$). Furthermore, we also assume that
$\varphi$ is not a tautology, i.e., there is at least one assignment $\nu$ such that $\nu(\varphi) = false$
(it actually means that there is a clause where no variable appears both positively and
negatively). The construction of $\mathcal{N}_\varphi = (Q, \{I, Z\}, \{a, b, c, \tau\}, \delta)$ will be described in a
stepwise manner. The sets $Q$ and $\delta$ are initially empty. For each clause $C_i$, $1 \le i \le m$,
we do the following:

- We add new control states $c_i$ and $q_i$ to $Q$. Moreover, for each variable $x_j$ and each $k$
such that $0 \le k < \pi_j$ we add to $Q$ a control state $\langle C_i, x_j, k \rangle$ (observe that for each
$C_i$ we add $O(n^3)$ states in this way).

- We add to S the transition qil A qil. 

- For each 1 < j < n we add to 6 the transitions c^/ A (A, Xj,0)I and Cil A ql. 

- For all j, k such that 1 < j < n and 0 < fc < tTj we add to S the transition 
{Ci, Xj, k)I A {Ci,Xj, {k + 1) mod •Kj)e. 

- For all j, k such that 1 < j < n and 0 < fc < we add to 6 the ‘loop’ 
{Ci,Xj,k)I A {Ci,Xj,k)I. 

- If a variable Xj does not appear positively in a clause A, then we add to 5 the loop 

(A,aA)^ A (A,A,0)^- 

- If a variable Xj does not appear negatively in a clause A, then we add to <5 the loops 
(A) Xj,k)Z A {Ci,Xj,k)Z for every 1 < k < iTj. 

If we draw the transition system which is generated by the current approximation of
$\mathcal{N}_\varphi$, we obtain a collection of $G_i$ graphs, $1 \le i \le m$; each $G_i$ corresponds to the
‘subgraph’ of $T_{\mathcal{N}_\varphi}$ which is obtained by restricting $Q$ to the set of control states which
have been added for the clause $C_i$. The structure of $G_i$ is shown in the following picture.
(To understand this figure, observe that transition systems associated to OC-automata 
can be viewed as two-dimensional ‘tables’ where column-indexes are control states and 
row-indexes are counter values (0, 1,2,.. .). As the out-going transitions of a state q{i) 
where i > 0 do not depend on the actual value of i, it suffices to depict the out-going 
transitions at zero and (some) non-zero level.)

Now we give several important facts about $G_i$ (each fact easily follows from the
previous ones):

- For each $l > 0$ we have that $c_i(l) \to^* \langle C_i, x_j, k \rangle(0)$ iff $l \bmod \pi_j = k$.

- For each $l > 0$, the state $c_i(l)$ ‘encodes’ the (unique) assignment $\nu_l$ defined by
$\nu_l(x_j) = true$ iff $c_i(l) \to^* \langle C_i, x_j, 0 \rangle(0)$; conversely, for each assignment $\nu$ there
is $l \in \mathbb{N}$ such that $\nu = \nu_l$ (for example, we can put $l = \prod_{j=1}^{n} f(j)$, where $f(j) = \pi_j$
if $\nu(x_j) = true$, and $f(j) = 1$ otherwise).

- For each ^ > 0 we have the following: 

• vi{Ci) = false iff Ci(() « C for the state C of the system T. Indeed, if 
vi{Ci) = false, then Ci{l) cannot reach any of the ‘zero-states’ where the 
action c is disabled — it can only emit c’s (possibly with some intermediate r’s) 
without a possibility to terminate. 

• vi{Ci) = true (i.e., the clause Ct is true for vi) iff Ci{l) ~ C for the state C 
of the system T. This also explains the role of the control state qi — we need 
it to make sure that the transition C ^ D can always be matched. 
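
The way a single counter value encodes a Boolean assignment can be illustrated independently of the net construction. The sketch below assumes only the arithmetic described above: variable $x_j$ is associated with the $j$-th prime $\pi_j$, the assignment $\nu_l$ sets $x_j$ to true iff $l \bmod \pi_j = 0$, and an assignment is encoded by the product of the primes of its true variables. The function names are illustrative.

```python
from itertools import product

def first_primes(n):
    """Return the first n prime numbers by trial division."""
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def decode(l, primes):
    """Assignment encoded by counter value l: x_j is true iff pi_j divides l."""
    return [l % p == 0 for p in primes]

def encode(nu, primes):
    """A counter value l with decode(l) == nu: product of primes of true variables."""
    l = 1
    for value, p in zip(nu, primes):
        if value:
            l *= p
    return l

primes = first_primes(4)            # [2, 3, 5, 7]
for nu in product([False, True], repeat=4):
    assert decode(encode(nu, primes), primes) == list(nu)
print(sum(primes))                  # number of residue states needed per clause
```

Every assignment is reached by some counter value, and the number of residues that have to be tracked per clause is the sum of the first $n$ primes, which is why that sum has to be polynomially bounded.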

We finish the construction of A/”^ by connecting the Gi components together. To do that, 
we add two new control states and r to Q, and enrich S by adding the transitions 

A- Si^IZ, Sipl A- s^I I, Si^I A- rl, and rl -A Cil for every 1 < i < m. The 
structure of $T_{\mathcal{N}_\varphi}$ is shown below.

Now we can observe the following:

- For each / > 0 we have that 

• = true iff r{l) ~ A for the state A of the system T. To see this, realize 
that vi(ip) ~ true iff vi{Ci) — true for each 1 < i < m iff Ci(l) ~ C for 
each 1 < i < TO due to the previous observations. 

• vi{tp) = false iff r{l) k, A for the state A of the system T. A proof is similar 
to the previous case; here we also need the assumption that at least one clause 
of is true for v\ (so that we can be sure that the transition A -A C can be 
matched by r(Z)). 

- is unsatisfiable iff 5(^(0) ~ Pi for the state Pi of T . Indeed, S;^(0) can perform its 
=5> move only by going to some (arbitrary) r{l). If ip is unsatisfiable, then vi{p) = 
false for each such r{l), hence all successors of s,^(0) are weakly bisimilar to A 
(see above), hence (0) ~ P 2 - If is satisfiable, then there is amove s,^(0) r(l) 
for some I such that iyi{p) = true, hence r{l) Ri A and r{l) A. Therefore, Pi 
cannot match the Sc^(O) r{l) move and thus Sc^(O) ^ Pi. 

- p is satisfiable iff s<p(0) ~ P± for the state Pi of T . It is checked in the same 

way as above. Here we use the assumption that p is not a tautology, i.e., S;^(0) can 
always choose an assignment which makes p false (i.e., s,^(0) can always match 
the transition Pi A A). □ 

The main reason why we could not extend the hardness result to some higher complexity 
class (e.g., PSPACE) is that there is no apparent way how to implement a ‘stepwise- 
guessing’ of Boolean variables which would allow to encode, e.g., the Qbf problem; 
each such attempt resulted in an exponential blow-up in the number of control states. 
However, we can still re-use our technique to prove the following: 

Theorem 2. The problem of strong bisimilarity between processes of one-counter nets
is co-NP-hard.

Proof. We use a similar construction as in the proof of Theorem 1. Given a formula $\varphi$
in CNF, we construct two one-counter nets $\mathcal{N}$, $\bar{\mathcal{N}}$ and their processes $s(0)$, $\bar{s}(0)$ such
that $\varphi$ is unsatisfiable iff $s(0) \sim \bar{s}(0)$. The net $\mathcal{N}$ is just a slight modification of the
net $\mathcal{N}_\varphi$ of Theorem 1: we only rename all $\tau$-labels to $c$. A key observation is that
$\varphi$ is unsatisfiable iff after each sequence of transitions of the form $c^*a$ (i.e., after each
choice of an assignment) there is a $b$-transition to a state which can only emit an infinite
sequence of c actions without a possibility to terminate (i.e., at least one clause is false 
for any assignment). The net A6 is a ‘copy’ of Af but we also add a new control state q 
and transitions ql A ql, fl -A ql where f is a ‘twin’ of the state r of M^p. We put s and 
s to be the corresponding twins of the state s^p of J\fp. Now we can easily check that p 
is unsatisfiable iff s(0) ~ s(0) — the crucial argument is stated above. □ 

3 Efficient Algorithms 

In this section we design an algorithm which decides weak bisimilarity between pro- 
cesses of OC-automata and finite-state processes. As expected, the algorithm requires 
exponential time in the worst case. However, it works rather efficiently for many (and we
believe that almost all) ‘practical’ instances. It also works for strong bisimilarity where 
it needs only polynomial time. 

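Let $T = (S, \Sigma, \to)$ be a transition system. For each $i \in \mathbb{N}_0$ we define the relation
$\approx_i$ inductively as follows: $\approx_0 = S \times S$, and $s \approx_{i+1} t$ iff $s \approx_i t$ and for each $s \stackrel{a}{\Rightarrow} s'$
there is some $t \stackrel{a}{\Rightarrow} t'$ such that $s' \approx_i t'$, and vice versa. (The $\approx_i$ relations are also used to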
relate states of different transition systems; formally, we consider two transition systems 
to be a single one by taking their disjoint union.) Our algorithm relies on the following 
theorem established in f7‘] : 

Theorem 3. Let $\mathcal{G} = (G, \Sigma, \to)$ be a (general) transition system and $\mathcal{F} = (F, \Sigma, \to)$
a finite-state system. We say that a state $g \in G$ is good w.r.t. $i \in \mathbb{N}_0$ iff there is $f \in F$
such that $g \approx_i f$; $g$ is bad w.r.t. $i$ iff $g$ is not good w.r.t. $i$.

Let $g \in G$, $f \in F$, and $k = |F|$. It holds that $g \approx f$ iff $g \approx_k f$ and each state which
is reachable from $g$ is good w.r.t. $k$.

For the rest of this section we fix a one-counter automaton $\mathcal{A} = (Q, \{I, Z\}, \Sigma, \delta)$ of
size $n$, and a finite-state system $\mathcal{F} = (F, \Sigma, \to)$ of size $m$.

To decide weak bisimilarity between processes $p(i)$ of $\mathcal{A}$ and $f$ of $\mathcal{F}$, it suffices (by
Theorem 3) to find out if $p(i) \approx_m f$ and whether $p(i)$ can reach a state which is bad
w.r.t. $m$. We do that by constructing a constant $z$ such that for each state $q(j)$ of $\mathcal{T}_A$
where $j > (4m + 1)z$ we have that $q(j) \approx_m q(j - z)$. In other words, each state of $\mathcal{T}_A$
is (up to $\approx_m$) represented by another (and effectively constructible) state whose counter
value is bounded by $(4m + 1)z$. Then we convert this ‘initial part’ of $\mathcal{T}_A$ to a finite-state
system $\mathcal{F}_A$ and construct the relation $\approx_m$ between states of $\mathcal{F}_A$ and $\mathcal{F}$. The question
if $p(i) \approx_m f$ is then easy to answer (we look if the representative of $p(i)$ within $\mathcal{F}_A$ is
related with $f$ by $\approx_m$). The question if $p(i)$ can reach a state which is bad w.r.t. $m$ still
requires some development: we observe that states which are bad w.r.t. $m$ are ‘regularly
distributed’ in $\mathcal{T}_A$ and construct a description of that distribution (which is ‘read’ from
$\mathcal{F}_A$) in the form of a finite-state automaton $\mathcal{M}$ which recognizes all bad states. Then we
use an algorithm of |4l| which constructs from $\mathcal{M}$ an automaton $\mathcal{M}'$ recognizing the set
of all states which can reach a state recognized by $\mathcal{M}$, and look whether $\mathcal{M}'$ accepts
$p(i)$. All procedures we use are polynomial in the size of $\mathcal{F}_A$. Hence, it is only the size of
$z$ which can make the problem computationally hard. The construction of $z$ can require
exponential time; however, we give an algorithm which efficiently (i.e., in polynomial
time) computes a reliable upper bound $Z$ for $z$.

Intuitively, the only difference between processes $p(i), p(j)$, where $i \neq j$, is the
way they can access the ‘zero level’. As long as the counter remains positive, each
process can ‘mimic’ moves of the other process by entering the same control state and
performing the same operation on the counter. Observe that the counter can generally be
decremented by an unbounded value in a single step (due to an unbounded number
of $\tau$-transitions). The next definitions and lemmata reveal a crucial periodicity in the
structure of $\mathcal{T}_A$ which shows that decrementing the counter ‘too much’ in one $\Rightarrow$ step
does not help to demonstrate a possible difference between $p(i)$ and $p(j)$.

For each $l \in \mathbb{N}_0$ we define a family of binary relations $\stackrel{a}{\Rightarrow}_l$, $a \in \Sigma$, over the set of
states of $\mathcal{T}_A$ as follows: $p(i) \stackrel{a}{\Rightarrow}_l q(j)$ iff there is a sequence of transitions from $p(i)$ to
$q(j)$ which forms one ‘$\stackrel{a}{\Rightarrow}$’ move and the counter value remains greater than or equal to $l$ in all
states which appear in the sequence (including $p(i)$ and $q(j)$).

Definition 1. We define a function $step_{\mathcal{A}} : Q \to 2^Q$ by $step_{\mathcal{A}}(p) = \{q \in Q \mid p(2) \stackrel{\tau}{\Rightarrow}_1 q(1)\}$.

Since the reachability problem for one-counter automata (and even for pushdown
automata) is in P, the function $step_{\mathcal{A}}$ is effectively constructible in polynomial time. As
the out-going transitions of a state $p(i)$ for $i > 0$ do not depend on the actual value of $i$,
for each $i \in \mathbb{N}$ we have that $q \in step_{\mathcal{A}}(p)$ iff $p(i + 1) \stackrel{\tau}{\Rightarrow}_i q(i)$.

We extend $step_{\mathcal{A}}$ to subsets of $Q$ by $step_{\mathcal{A}}(M) = \bigcup_{p \in M} step_{\mathcal{A}}(p)$. For each $p \in Q$
we now define the sequence $C_p$ inductively as follows: $C_p(1) = \{p\}$ and $C_p(i + 1) = step_{\mathcal{A}}(C_p(i))$.
The next lemma is easy to prove by a straightforward induction on $i$.

Lemma 1. For all p G Q and i,j G N we have that q G Cp{j) ijfp{i + j) q{i). 

Another simple observation is that the sequence $C_p$ is (for every $p \in Q$) of the form
$C_p = \alpha_p \beta_p^{\omega}$ where $\alpha_p, \beta_p$ are finite sequences of pairwise different subsets of $Q$ (due to
the assumption that the elements of $\alpha_p$ and $\beta_p$ are pairwise different we also have that
$\alpha_p$ and $\beta_p$ are unique). Note that $\beta_p$ can also consist of just one element $\emptyset$. We define
the prefix and period of $p$, denoted $pre(p)$ and $per(p)$, to be the length of $\alpha_p$ and $\beta_p$,
respectively. Now we put

$$z = \max\{pre(p) \mid p \in Q\} \cdot \mathrm{lcm}\{per(p) \mid p \in Q\}$$

where lcm(M) denotes the least common multiple of the elements of M. As we shall see,
max{pre(p) | p G Q} is always However, lcm{per(p) | p G Q} can be exponen- 
tial in n (for example, examine the net constructed in the proof of TheoremlO. As we 

already mentioned, the size of $z$ is the only thing which can make the considered problem
hard. Hence, we obtain a kind of ‘characterization’ of all hard instances: OC-automata
which appear in hard instances must contain many ‘decreasing $\tau$-cycles’ of
pairwise incomparable lengths. Also observe that the construction of $z$ can require exponential
time, because $per(p)$ for a given $p$ can be exponential in $n$ (at the end of this section
we show how to compute a reasonable upper bound $Z$ for $z$ efficiently). The following
lemma is immediate: 

Lemma 2. For all $p \in Q$ and $i > z$ we have that $C_p(i) = C_p(i + z)$.
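The quantities pre(p), per(p) and z can be computed directly from the definition once the step function is known. The following is a minimal sketch under stated assumptions: the step function is given as a plain dictionary (`STEP` is a made-up example, not derived from any automaton in the text), and the helper names `prefix_and_period` and `compute_z` are illustrative only. It iterates C_p(1) = {p}, C_p(i+1) = step(C_p(i)) until a set repeats.

```python
from functools import reduce
from math import gcd


def prefix_and_period(p, step):
    """Return (pre(p), per(p)) for C_p(1) = {p}, C_p(i+1) = step(C_p(i))."""
    position = {}                      # set -> 0-based index of its first occurrence
    current = frozenset([p])
    index = 0
    while current not in position:
        position[current] = index
        # Extension of step to sets: union of step(s) over all s in the set.
        current = frozenset(q for s in current for q in step.get(s, ()))
        index += 1
    pre = position[current]            # length of the non-repeating prefix alpha_p
    per = index - pre                  # length of the repeating block beta_p
    return pre, per


def lcm(a, b):
    return a * b // gcd(a, b)


def compute_z(step, states):
    """z = max{pre(p)} * lcm{per(p)} over all control states p."""
    pres, pers = zip(*(prefix_and_period(p, step) for p in states))
    return max(pres) * reduce(lcm, pers, 1)


# Hypothetical step function on four control states (illustration only).
STEP = {"a": {"b"}, "b": {"c"}, "c": {"b"}, "d": {"d"}}
STATES = ["a", "b", "c", "d"]
```

For this example the sequence for "a" is {a}, {b}, {c}, {b}, ..., so the prefix has length 1 and the period length 2. A state with an empty step set yields the sequence {p}, ∅, ∅, ..., which matches the remark that the periodic part may consist of the single element ∅.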

The next three lemmata provide a crucial observation about the structure of $\mathcal{T}_A$ and
precisely formulate the intuition that ‘decreasing the counter too much in one $\Rightarrow$ step
does not help’. Proofs can be found in ITDtl . 

Lemma 3. For all $p \in Q$ and $j \in \mathbb{N}$ it holds that

- if there is a sequence of $\tau$-transitions from $p(j + 2z)$ to (some) $q(l)$ which decreases
the counter to $j$ at some point, then $p(j + z) \stackrel{\tau}{\Rightarrow} q(l)$;

- if there is a sequence of $\tau$-transitions from $p(j + z)$ to (some) $q(l)$ which decreases
the counter to $j$ at some point, then $p(j + 2z) \stackrel{\tau}{\Rightarrow} q(l)$.
Lemma 4. For all $p \in Q$ and $j \in \mathbb{N}$ it holds that

- if there is a sequence of transitions forming one ‘$\stackrel{a}{\Rightarrow}$’ move from $p(j + 4z)$ to (some)
$q(l)$ which decreases the counter to $j$ at some point, then $p(j + 3z) \stackrel{a}{\Rightarrow} q(l)$;

- if there is a sequence of transitions forming one ‘$\stackrel{a}{\Rightarrow}$’ move from $p(j + 3z)$ to (some)
$q(l)$ which decreases the counter to $j$ at some point, then $p(j + 4z) \stackrel{a}{\Rightarrow} q(l)$.

Lemma 5. Let $p \in Q$ and $k \in \mathbb{N}_0$. For each $c > (4k + 1)z$ we have that $p(c) \approx_k p(c - z)$.

Now we are almost in a position to prove the first main theorem of this section. It remains 
to extend our equipment with the following tool: 

Definition 2. Let $\mathcal{P} = (Q, \Gamma, \Sigma, \delta)$ be a pushdown automaton, $\mathcal{M} = (S, \Gamma, \gamma, F)$ a
nondeterministic finite-state automaton (note that the input alphabet of $\mathcal{M}$ is the stack
alphabet of $\mathcal{P}$), and $Init : Q \to S$ a total function. A process $p\alpha$ of $\mathcal{P}$ is recognized by
the pair $(\mathcal{M}, Init)$ iff $\hat{\gamma}(Init(p), \alpha) \cap F \neq \emptyset$ where $\hat{\gamma}$ is the natural extension of $\gamma$ to
elements of $S \times \Gamma^*$.
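To make the recognition condition concrete, here is a small sketch of how membership of a process pα could be tested: start the automaton in Init(p), read the stack content α symbol by symbol, and check whether a final state is reachable. The dictionary encoding of the transition function and the example automaton (`GAMMA`, `INIT`, `FINAL`) are assumptions made for illustration; they do not come from the text.

```python
def recognized(p, alpha, gamma, init, final):
    """Check whether the process p.alpha is recognized by the pair (M, Init).

    gamma maps (state, stack_symbol) to a set of successor states,
    init maps control states to automaton states, final is the set F.
    """
    current = {init[p]}
    for symbol in alpha:
        # One step of the extension of gamma to sets of states and words.
        current = {t for s in current for t in gamma.get((s, symbol), ())}
    return bool(current & final)


# Hypothetical automaton over the stack alphabet {I, Z}: from s0 it accepts
# stacks with an even number of I's followed by Z.
GAMMA = {("s0", "I"): {"s1"}, ("s1", "I"): {"s0"}, ("s0", "Z"): {"fin"}}
INIT = {"p": "s0", "q": "s1"}
FINAL = {"fin"}
```

With this encoding, `recognized("p", "IIZ", GAMMA, INIT, FINAL)` holds, while the same control state with stack content "IZ" is rejected because the run dies before reaching the final state.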

The next theorem is taken from |01. 

Theorem 4. Let $\mathcal{P} = (Q, \Gamma, \Sigma, \delta)$ be a pushdown automaton, $\mathcal{M} = (S, \Gamma, \gamma, F)$
a finite-state automaton, and $Init : Q \to S$ a total function. Let $N$ be the set of
processes recognized by $(\mathcal{M}, Init)$. Then one can effectively construct an automaton
M' = (S, F, 7', F) in time 0(|(5| • jiS'p) such that (Ai' , Init) recognizes the set 

$$Pre^*(N) = \{q\beta \mid q\beta \to^* p\alpha \text{ for some } p\alpha \in N\}$$

of all predecessors of N. 



Theorem 5. The problem of weak bisimilarity between processes p(i) of A and f of F 
is decidable in 0(n^ mfi z^ (i + 1)) timeQ 

Proof. By Theorem 3 we need to find out whether $p(i) \approx_m f$ and whether $p(i)$ can
reach a state which is bad w.r.t. $m$. Due to Lemma 5 we know that the set of all states of
$\mathcal{T}_A$ up to $\approx_m$ can be represented by the subset of states of $\mathcal{T}_A$ where the counter value
is at most $(4m + 1)z$. Formally, we first define the function $B : (Q \times \mathbb{N}_0) \to (Q \times \mathbb{N}_0)$
as follows (where $(q, j)$ is just another notation for $q(j)$):

$$B(q, j) = \begin{cases} (q, j) & \text{if } j \le (4m+1)z; \\ (q, (4m+1)z) & \text{if } j > (4m+1)z \text{ and } (j \bmod z) = 0; \\ (q, 4mz + (j \bmod z)) & \text{if } j > (4m+1)z \text{ and } (j \bmod z) \neq 0. \end{cases}$$
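The case distinction defining B translates into a few lines of code. This is only a sketch: the function name `representative` is hypothetical, the first case is read as j ≤ (4m+1)z, and z ≥ 1 is assumed so that the modulus is defined.

```python
def representative(q, j, m, z):
    """Map the state q(j) to its representative B(q(j)) with a bounded counter.

    Assumes z >= 1. The result has a counter value of at most (4m+1)z that
    differs from j by a multiple of z.
    """
    bound = (4 * m + 1) * z
    if j <= bound:
        return (q, j)                       # small counters are left unchanged
    if j % z == 0:
        return (q, bound)                   # multiples of z map to the top level
    return (q, 4 * m * z + j % z)           # otherwise keep the residue modulo z
```

In every case the counter of the representative is congruent to j modulo z, which is what repeated application of Lemma 5 (subtracting z while the counter exceeds the bound) would produce.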

An immediate consequence of Lemma 5 is that for all $q \in Q$ and $j \in \mathbb{N}_0$ we have

$q(j) \approx_m B(q(j))$. Now we define a finite-state system $\mathcal{F}_A = (F_A, \Sigma, \to)$ where $F_A$ is
the image of $B$ (i.e., $F_A = \{q(j) \mid q \in Q, 0 \le j \le (4m + 1)z\}$), $\Sigma$ is the set of actions

* Note that we need a non-constant time even in the particular case when i = 0 (the problem is
still DP-hard). That is why we write ‘i + 1’.
of $\mathcal{A}$, and ‘$\to$’ is the least relation satisfying the following: if $r(k) \stackrel{a}{\to} s(l)$ is a transition
of $\mathcal{T}_A$, then $B(r(k)) \stackrel{a}{\to} B(s(l))$. Observe that $\mathcal{F}_A$ is actually the ‘initial part’ of $\mathcal{T}_A$;

the only difference is that all up-going transitions of states at level $(4m + 1)z$ are ‘bent’
down to the corresponding $\approx_m$-equivalent states at level $4mz + 1$. Note that for each
$q(j)$ we still have that $q(j) \approx_m B(q(j))$ (when $B(q(j))$ is seen as a state of $\mathcal{F}_A$). The
number of states of $\mathcal{F}_A$ is $\mathcal{O}(n\,m\,z)$; moreover, the number of out-going transitions at
each ‘level’ of $\mathcal{F}_A$ is $\mathcal{O}(n)$, hence the size of $\to$ is $\mathcal{O}(n\,m\,z)$, which means that the total
size of $\mathcal{F}_A$ is also $\mathcal{O}(n\,m\,z)$.

Now, let us realize that if we have a finite-state system of size t, it takes 0{t^) time 
to compute the associated ‘A’ relation (for each state s and action a we need 0{t) time 
to compute the set {r | s A r}). Therefore, we need 0{n^ nA z^) time to construct the 
extended transition relations for $\mathcal{F}_A$ and $\mathcal{F}$. To compute the $\approx_m$ relation between the
states of $\mathcal{F}_A$ and $\mathcal{F}$, we define $\mathcal{R}^0 = F_A \times F$, and $\mathcal{R}^{i+1} = Exp(\mathcal{R}^i)$ where the function
$Exp : 2^{F_A \times F} \to 2^{F_A \times F}$ refines its argument according to the definition of $\approx_i$:
a pair $(r(j), g)$ belongs to $Exp(\mathcal{R})$ iff it belongs to $\mathcal{R}$ and for each ‘$\stackrel{a}{\Rightarrow}$’ move of
one component there is a corresponding ‘$\stackrel{a}{\Rightarrow}$’ move of the other component such that the
resulting pair of states belongs to $\mathcal{R}$. Clearly, for each pair $(r(j), g)$ of $F_A \times F$ we have
that $r(j) \approx_m g$ iff $(r(j), g) \in \mathcal{R}^m$. It remains to clarify the time costs. The function
Exp is computed m times. Each time, 0{nnT z) pairs are examined. For each such 
pair we have to check the membership to Exp {TV). This takes only 0{nm^ z) time, 
because the extended transition relations have already been computed. To sum up, we 
need 0{iA nT z°) time in total. 
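The two phases described above (first computing the extended transition relation, then refining the full relation a fixed number of times) can be sketched for two small finite systems as follows. This is an illustrative, unoptimized version: the encoding of transitions as triples, the label "tau" for the silent action, and the names `weak_moves` and `approximant` are assumptions, and no attempt is made to match the time bounds stated in the text.

```python
def tau_reach(states, trans, tau):
    """For every state, the set of states reachable by zero or more tau steps."""
    succ = {s: set() for s in states}
    for s, a, t in trans:
        if a == tau:
            succ[s].add(t)
    reach = {}
    for s in states:
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in succ[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        reach[s] = seen
    return reach


def weak_moves(states, trans, actions, tau="tau"):
    """Extended transition relation: (s, a) -> set of t with s =a=> t."""
    reach = tau_reach(states, trans, tau)
    weak = {(s, a): set() for s in states for a in actions}
    for s in states:
        weak[(s, tau)] |= reach[s]
        for u in reach[s]:
            for x, a, t in trans:
                if x == u and a != tau:
                    weak[(s, a)] |= reach[t]
    return weak


def approximant(left, right, weak, actions, rounds):
    """Start from left x right and apply the refinement step `rounds` times."""
    rel = {(g, f) for g in left for f in right}
    for _ in range(rounds):
        rel = {
            (g, f) for (g, f) in rel
            if all(any((g2, f2) in rel for f2 in weak[(f, a)])
                   for a in actions for g2 in weak[(g, a)])
            and all(any((g2, f2) in rel for g2 in weak[(g, a)])
                    for a in actions for f2 in weak[(f, a)])
        }
    return rel


# Hypothetical example: g0 -tau-> g1 -a-> g2 compared with f0 -a-> f1.
LEFT, RIGHT = ["g0", "g1", "g2"], ["f0", "f1"]
TRANS = {("g0", "tau", "g1"), ("g1", "a", "g2"), ("f0", "a", "f1")}
ACTIONS = ["tau", "a"]
WEAK = weak_moves(LEFT + RIGHT, TRANS, ACTIONS)
```

In the example, two refinement rounds (the number of states of the right-hand system) leave exactly the pairs (g0, f0), (g1, f0) and (g2, f1), so g0 is related to f0 at that level.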

To check if $p(i) \approx_m f$, we simply look if $(B(p(i)), f) \in \mathcal{R}^m$. It remains to find out
whether $p(i)$ can reach a state $q(j)$ which is bad w.r.t. $m$. Observe that $q(j)$ is bad w.r.t.
$m$ iff the state $B(q(j))$ of $\mathcal{F}_A$ is bad w.r.t. $m$. Therefore, we can easily construct a finite-state
automaton $\mathcal{M}$ and a function $Init$ such that the pair $(\mathcal{M}, Init)$ recognizes the set of
all bad states of $\mathcal{T}_A$: we put $\mathcal{M} = (S, \{I, Z\}, \gamma, \{fin\})$ where
$S = \{fin\} \cup \{p(i) \mid p \in Q, 0 \le i \le (4m + 1)z\}$ and $\gamma$ is the least transition function satisfying the following:

- $p(i + 1) \in \gamma(p(i), I)$ for all $p \in Q$, $0 \le i < (4m + 1)z$;

- $p(4mz + 1) \in \gamma(p((4m + 1)z), I)$ for each $p \in Q$;

- if a state $p(i)$ of $\mathcal{F}_A$ is bad, then $fin \in \gamma(p(i), Z)$.

The function Init is defined by Init{p) = p(0) for allp G Q. Note that A4 has 0{n m z) 
states. Now we compute the automaton Ai' of Theorem01(it takes 0{n^ m z) time) and 
check if {A4' , Init) recognizes p{i). This can be done in 0(nm z (z + 1)) time because 
Ai' has the same set of states as Ai. 

We see that 0{n^ mfi z° (z + 1)) time suffices for all of the aforementioned proce- 
dures. □ 

Our algorithm also works for strong bisimilarity in the following way: If we are to decide
strong bisimilarity between $p(i)$ and $f$, we first rename all $\tau$-transitions of $\mathcal{A}$ and $\mathcal{F}$ with
some (fresh) action $e$ (it does not change anything from the point of view of strong
bisimilarity, because here the $\tau$-transitions are treated as ‘ordinary’ ones). As there are
no $\tau$-transitions anymore, there is no difference between strong and weak bisimilarity,
hence we can use the designed algorithm. Also observe that if there are no $\tau$’s then
$z = 1$, so we can conclude:
Corollary 1. The problem of strong bisimilarity between processes $p(i)$ of $\mathcal{A}$ and $f$ of
$\mathcal{F}$ is in P.

As we already mentioned, the construction of z can take exponential time. Now we show 
how to compute a rather tight upper bound Z for z in polynomial time. The associated 
lemmata and proofs can be found in m- 

Theorem 6. We say that p G Q is self-embedding iff p G Cp{i) for some i > 2. Let 
us define Z = (|Qp + |<5|) • lcm{per(p) \ p G Q is self-embedding }. Then Z can be 
computed in time which is polynomial in n. Moreover, z < Z. 
Abstract. Strong bisimilarity of Basic Parallel Processes (BPP) is decidable, but
the best known algorithm has non-elementary complexity [ij;]. On the other hand,
no lower bound for the problem was known. We show that strong bisimilarity of
BPP is co-NP-hard.

Weak bisimilarity of BPP is not known to be decidable, but an NP lower bound
has been shown in ED-. We improve this result by showing that weak bisimilarity
of BPP is $\Pi_2^p$-hard.

Finally, we show that the problems if a BPP is regular (i.e., finite) w.r.t. strong and
weak bisimilarity are co-NP-hard and $\Pi_2^p$-hard, respectively.



1 Introduction 

Bisimulation equivalence plays a central role in the theory of process algebras 
The decidability and complexity of bisimulation problems for infinite-state systems has 
been studied intensively (see for a survey). While many algorithms for bisimulation 
problems have a very high complexity, only few lower bounds are known. 

The state of the art. Strong bisimilarity of two Petri nets and weak bisimilarity of
a Petri net and a finite automaton are undecidable lOllIl-. Weak bisimilarity for Basic
Parallel Processes (BPP) is NP-hard and weak bisimilarity for context-free processes
(BPA) is PSPACE-hard ED. However, it is still an open question whether these problems
are decidable.

Some lower bounds for decidable bisimulation problems have been shown in E3- 
Strong (and weak) bisimilarity between pushdown automata (PDA) and finite automata
is PSPACE-hard, and finiteness of PDA w.r.t. weak and strong bisimilarity is also
PSPACE-hard. Finally, both strong bisimilarity of Petri nets and finite automata and finiteness of
Petri nets w.r.t. strong bisimilarity are EXPSPACE-hard. (See the table in Section|3for 
a summary of all results on the complexity of bisimulation problems.) 

Basic Parallel Processes (BPP) were introduced by Christensen as the fragment 
of CCS Ea without communication, restriction and relabeling. They are equivalent 
to communication-free nets L8L the subclass of Petri nets |]ZS1 where every transition 
has exactly one input-place with arc-weight one. While strong (and weak) bisimilar- 
ity are undecidable for Petri nets m, strong bisimilarity is decidable for BPP (i.e., 
communication-free nets) m. However, the algorithm in m has non-elementary com- 
plexity and, to the best of our knowledge, no better algorithm has been found since then. 
In spite of this, no lower bound for the problem has been found either. 
However, there is a polynomial algorithm for bisimilarity on the restricted subclass
of normed BPP m. (A process is normed iff from every reachable state there is a terminating
computation.) Thus, it was conjectured that a polynomial algorithm should also
exist for general (unnormed) BPP. This belief was reinforced by the fact that many other
problems for BPP are polynomial: boundedness o, termination, liveness, (partial)
deadlock reachability and (partial) livelock reachability ED El-. (On the other hand
there are also hard problems for BPP: reachability is NP-complete (Si], some model
checking problems are PSPACE-complete II2(JI 1221 or even undecidable 0-.)

Our contribution. We show that strong bisimilarity for BPP is co-NP-hard (thus
proving the above-mentioned conjecture wrong). We also show that weak bisimilarity
for BPP is $\Pi_2^p$-hard, thus improving a previously established NP lower bound II31I.
Finally, we show that the problem if a BPP is regular (i.e., finite) w.r.t. strong and weak
bisimilarity is co-NP-hard and $\Pi_2^p$-hard, respectively.

2 Definitions 



Let $Act = \{a, b, c, \ldots\}$ and $Const = \{\varepsilon, X, Y, Z, \ldots\}$ be disjoint countably infinite sets
of actions and process constants, respectively. The class of general process expressions
$G$ is defined by $E ::= \varepsilon \mid X \mid E \| E \mid E.E$, where $X \in Const$ and $\varepsilon$ is a special
constant that denotes the empty expression. Intuitively, ‘.’ is a sequential composition
and ‘$\|$’ is a parallel composition. We do not distinguish between expressions related by
structural congruence which is given by the following laws: ‘.’ and ‘$\|$’ are associative,
‘$\|$’ is commutative, and ‘$\varepsilon$’ is a unit for ‘.’ and ‘$\|$’.

A process rewrite system (PRS) m is specified by a finite set $\Delta$ of rules which have
the form $E \stackrel{a}{\to} F$, where $E, F \in G$, $E \neq \varepsilon$ and $a \in Act$. $Const(\Delta)$ and $Act(\Delta)$ denote
the sets of process constants and actions which are used in the rules of $\Delta$, respectively
(note that these sets are finite). Each process rewrite system $\Delta$ defines a unique transition
system where states are process expressions over $Const(\Delta)$, and $Act(\Delta)$ is the set of labels.
The transitions are determined by $\Delta$ and the following inference rules (remember that
‘$\|$’ is commutative):

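$$\frac{(E \stackrel{a}{\to} F) \in \Delta}{E \stackrel{a}{\to} F} \qquad \frac{E \stackrel{a}{\to} E'}{E.F \stackrel{a}{\to} E'.F} \qquad \frac{E \stackrel{a}{\to} E'}{E \| F \stackrel{a}{\to} E' \| F}$$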

We extend the notation $E \stackrel{a}{\to} F$ to elements of $Act^*$ in a standard way. Moreover, we
say that $F$ is reachable from $E$ if $E \stackrel{w}{\to} F$ for some $w \in Act^*$.

Various subclasses of process rewrite systems can be obtained by imposing certain 
restrictions on the form of rules. To specify those restrictions, we first define the classes S 
and P of sequential and parallel expressions, composed of all process expressions which 
do not contain the ‘$\|$’ and the ‘.’ operator, respectively. We also use ‘1’ to denote the set
of process constants. The hierarchy of process rewrite systems is presented in Fig. 1; the
restrictions are specified by a pair $(A, B)$, where $A$ and $B$ are the classes of expressions
which can appear on the left-hand and the right-hand side of rules, respectively. This 
hierarchy contains almost all classes of infinite state systems which have been studied 
so far; BPA (Basic Process Algebra, also called context-free processes), BPP (Basic 
Parallel Processes), and PA-processes are well-known [(0, PDA correspond to pushdown 
                 PRS (G,G)
         PAD (S,G)        PAN (P,G)
   PDA (S,S)      PA (1,G)      PN (P,P)
         BPA (1,S)        BPP (1,P)
                 FS (1,1)

Fig. 1. A hierarchy of PRS



automata (as proved by Caucal in |0), PN correspond to Petri nets, PRS stands for 
‘Process Rewrite Systems’, PAD and PAN are artificial names made by combining 
existing ones (PAD = PA+PDA, PAN = PA+PN). 

Here we study Basic Parallel Processes (BPP) that correspond to process rewrite
systems of type $(1, P)$.

We consider the semantical equivalences weak bisimilarity and strong bisimilarity 
121 over transition systems generated by PRS. 

Definition 1. The action $\tau$ is a special ‘silent’ internal action. The extended transition
relation ‘$\stackrel{a}{\Rightarrow}$’ is defined by $E \stackrel{a}{\Rightarrow} F$ iff either $E = F$ and $a = \tau$, or $E \stackrel{\tau^i}{\to} E' \stackrel{a}{\to} E'' \stackrel{\tau^j}{\to} F$
for some $i, j \in \mathbb{N}_0$, $E', E'' \in G$. A binary relation $R$ over process expressions is a weak
bisimulation iff whenever $(E, F) \in R$ then for every $a \in Act$: if $E \stackrel{a}{\to} E'$ then there
is $F \stackrel{a}{\Rightarrow} F'$ s.t. $(E', F') \in R$ and if $F \stackrel{a}{\to} F'$ then there is $E \stackrel{a}{\Rightarrow} E'$ s.t. $(E', F') \in R$.
Processes $E, F$ are weakly bisimilar, written $E \approx F$, iff there is a weak bisimulation
relating them. Strong bisimulation is defined similarly with $\stackrel{a}{\to}$ instead of $\stackrel{a}{\Rightarrow}$. Processes
$E, F$ are strongly bisimilar, written $E \sim F$, iff there is a strong bisimulation relating
them. The largest (strong or weak) bisimulation is an equivalence relation.

Bisimulation equivalence can also be described by bisimulation games lEOIEl 
between two players. One player, the ‘attacker’, tries to prove that two given processes 
are not bisimilar, while the other player, the ‘defender’, tries to frustrate this. In every 
round of the game the attacker chooses one process and performs an action. The defender 
must imitate this move and perform the same action in the other process (possibly together 
with several internal r-actions in the case of weak bisimulation). If one player cannot 
move then the other player wins. The defender wins every infinite game. Two processes 
are bisimilar iff the defender has a winning strategy and non-bisimilar iff the attacker 
has a winning strategy. 
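For a finite transition system the game can be played out mechanically. The sketch below decides whether the attacker can win the strong bisimulation game within a bounded number of rounds; the encoding of transitions as triples, the function name `make_game`, and the example processes are assumptions chosen for illustration (the example is the textbook pair a.(b + c) versus a.b + a.c, which does not occur in the text).

```python
from functools import lru_cache


def make_game(trans):
    """Build a function deciding whether the attacker wins within k rounds."""
    succ = {}
    for s, a, t in trans:
        succ.setdefault(s, []).append((a, t))

    @lru_cache(maxsize=None)
    def attacker_wins(s, t, rounds):
        if rounds == 0:
            return False                 # no rounds left: the attacker has not won
        # The attacker may move in either process.
        for mover, other in ((s, t), (t, s)):
            for action, mover_next in succ.get(mover, ()):
                replies = [n for b, n in succ.get(other, ()) if b == action]
                # The attacker wins if every reply (possibly none) loses for the defender.
                if all(attacker_wins(mover_next, reply, rounds - 1) for reply in replies):
                    return True
        return False

    return attacker_wins


# Hypothetical example: p0 = a.(b + c) and q0 = a.b + a.c.
TRANS = [
    ("p0", "a", "p1"), ("p1", "b", "p2"), ("p1", "c", "p3"),
    ("q0", "a", "q1"), ("q0", "a", "q2"), ("q1", "b", "q3"), ("q2", "c", "q4"),
]
attacker_wins = make_game(TRANS)
```

Here the attacker needs two rounds: after forcing the defender into p1 against q1, the action c can be performed on one side only. Comparing a state with itself never gives the attacker a win, in line with the rule that the defender wins every game the attacker cannot finish.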
3 Hardness of Strong Bisimilarity for BPP

Strong bisimilarity of BPP
Instance: Two BPP processes $P_1$ and $P_2$.
Question: $P_1 \sim P_2$?

This problem has been shown to be decidable in m . However, the algorithm relies on 
Dickson’s Lemma for termination and therefore the algorithm is not primitive recursive. 
A polynomial algorithm for bisimilarity on the restricted subclass of normed BPP has 
been described in m, which led to the conjecture that the general problem was also 
polynomial. We prove this conjecture wrong by proving a co-NP lower bound. Thus
no polynomial algorithm for strong bisimilarity of BPP can exist, unless P = NP.

First we give an intuition why the general (unnormed) problem is so hard, us- 
ing the terminology of communication-free Petri nets. The problem if a place in a 
communication-free net is unbounded (i.e., if there are reachable states that put ar- 
bitrarily high numbers of tokens on it) is easily decidable in polynomial time ini- 
However, it is not so easy to determine if the number of tokens on a place really matters 
w.r.t. bisimilarity, i.e., if states with different numbers of tokens on this place are really 
different w.r.t. bisimilarity (i.e., non-bisimilar). First we consider the simple example $\Delta$:
$X \stackrel{a}{\to} X \| Y$, $X \stackrel{a}{\to} \varepsilon$, $Y \stackrel{a}{\to} \varepsilon$, $Z \stackrel{a}{\to} Z$. The process $(X, \Delta)$ is infinite w.r.t. bisimilarity
(since it has infinitely many non-bisimilar reachable states). However, $(X \| Z, \Delta)$ is finite
w.r.t. bisimilarity, since $(X \| Z, \Delta) \sim (Z, \Delta)$. We say that in the process $(X \| Z, \Delta)$
the subprocess $(Z, \Delta)$ masks the infiniteness of $(X, \Delta)$. In particular, the subprocess
$Z$ has the effect that the number of subprocesses $Y$ doesn’t matter for bisimilarity,
since $(Y^n \| Z, \Delta) \sim (Y^m \| Z, \Delta)$ for any $n, m \in \mathbb{N}$. Now consider the new system
$\Delta' := \Delta \cup \{Z \stackrel{a}{\to} \varepsilon\}$. The process $(X \| Z, \Delta')$ is infinite w.r.t. bisimilarity, because
$(X \| Z, \Delta') \stackrel{a}{\to} (X, \Delta')$. We say that by this transition the subprocess $X$ is unmasked.
Of course, this is only a very trivial example of masking and unmasking. In general the
question if a process can be unmasked (i.e., if a place matters w.r.t. bisimilarity) is NP-hard.
Later in this section we use a more complex example of masking and unmasking
to prove this.
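The masking effect can be explored with a small interpreter for BPP, in which a state is a multiset of constants and a rule rewrites one constant. The sketch below is only an illustration: it assumes that every rule of the example carries the same action label a (the labels are not legible in this copy of the text), represents states as sorted tuples, and uses the hypothetical helper names `successors` and `longest_run`.

```python
# Rules of the example system, under the assumption that all labels are "a".
RULES = {
    "X": [("a", ("X", "Y")), ("a", ())],
    "Y": [("a", ())],
    "Z": [("a", ("Z",))],
}


def successors(state, rules):
    """All (action, next_state) pairs of a BPP state given as a sorted tuple."""
    result = set()
    for const in set(state):
        rest = list(state)
        rest.remove(const)               # rewrite one occurrence of the constant
        for action, rhs in rules.get(const, ()):
            result.add((action, tuple(sorted(rest + list(rhs)))))
    return result


def longest_run(state, rules, limit):
    """Length of the longest action sequence from state, capped at limit."""
    if limit == 0:
        return 0
    return max(
        (1 + longest_run(nxt, rules, limit - 1) for _, nxt in successors(state, rules)),
        default=0,
    )
```

Without Z, a state consisting of n copies of Y can perform exactly n actions, so different values of n are distinguishable. As soon as Z is present the run length is unbounded for every n, which is the masking phenomenon described above.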

For the subclass of normed BPP, finiteness w.r.t. bisimilarity coincides with bounded- 
ness. Thus for normed BPP finiteness w.r.t. bisimilarity is decomposable into properties 
of subprocesses and decidable in polynomial time. In particular, for normed BPP, the 
parallel composition of two infinite processes yields an infinite process. For general BPP 
it is different. The parallel composition of infinite processes (w.r.t. bisimilarity) can yield 
a process that is finite w.r.t. bisimilarity. Thus, finiteness (or infiniteness) w.r.t. bisimilar- 
ity of a BPP process cannot be decomposed into properties of subprocesses in general. 
The following example shows this. Let A be 



Tti > Yi 

x„ ^ x„||F„, 



Then the process $X_1 \| X_2 \| \ldots \| X_n$ is finite w.r.t. bisimilarity, but every subprocess (e.g.
$X_3 \| X_4 \| X_7$) is infinite w.r.t. bisimilarity.

Now we are ready to prove the co-NP lower bound for strong bisimilarity of BPP.
We do this by a polynomial reduction of 3-SAT to the negation of the problem. Let $n \in \mathbb{N}$
and let $x_1, \ldots, x_n$ be boolean variables. A literal is either a variable or the negation of
a variable. A clause is a disjunction of 3 literals. Let $Q := Q_1 \wedge \ldots \wedge Q_k$ be a boolean
formula in 3-CNF over $x_1, \ldots, x_n$ with $k$ clauses. We construct BPPs $P_1$ and $P_2$ s.t. $Q$
is satisfiable iff $P_1 \not\sim P_2$. The set of transition rules $\Delta$ is defined as follows.

For every $i \in \{1, \ldots, n\}$ we have $X_i \stackrel{x_i}{\to} X_{i+1} \| \alpha_i$ where $\alpha_i$ is a parallel composition
of constants defined as follows: For every $j \in \{1, \ldots, k\}$ let $A_j$ be in $\alpha_i$ iff the first
literal of $Q_j$ is $\bar{x}_i$. For every $j \in \{1, \ldots, k\}$ let $B_j$ be in $\alpha_i$ iff the second literal of
$Q_j$ is $\bar{x}_i$. For every $j \in \{1, \ldots, k\}$ let $C_j$ be in $\alpha_i$ iff the third literal of $Q_j$ is $\bar{x}_i$.

For every $i \in \{1, \ldots, n\}$ we have $X_i \stackrel{\bar{x}_i}{\to} X_{i+1} \| \beta_i$ where $\beta_i$ is a parallel composition of
constants defined as follows: For every $j \in \{1, \ldots, k\}$ let $A_j$ be in $\beta_i$ iff the first literal
of $Q_j$ is $x_i$. For every $j \in \{1, \ldots, k\}$ let $B_j$ be in $\beta_i$ iff the second literal of $Q_j$ is $x_i$.
For every $j \in \{1, \ldots, k\}$ let $C_j$ be in $\beta_i$ iff the third literal of $Q_j$ is $x_i$.

The intuition is that by action $x_i$/$\bar{x}_i$ one chooses the value true/false for the variable $x_i$. $Q$ is satisfiable
iff the assignment of values to the variables can be chosen in such a way that for every
$j \in \{1, \ldots, k\}$ at least one of the constants $\{A_j, B_j, C_j\}$ does not appear.
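The bookkeeping behind this encoding can be checked on small formulas. The sketch below assumes one particular reading of the construction, namely that choosing a value for a variable adds the constants of exactly those literal positions that the choice falsifies, so that a clause is violated iff all three of its constants appear. The clause format (triples of nonzero integers, with -i standing for the negation of variable i) and all function names are illustrative assumptions.

```python
from itertools import product


def build_alpha_beta(clauses, n):
    """alpha[i] / beta[i]: constants added when x_i is set to true / false."""
    names = "ABC"
    alpha = {i: [] for i in range(1, n + 1)}
    beta = {i: [] for i in range(1, n + 1)}
    for j, clause in enumerate(clauses, start=1):
        for pos, lit in enumerate(clause):
            const = f"{names[pos]}{j}"
            if lit < 0:
                alpha[-lit].append(const)   # a negated literal is falsified by "true"
            else:
                beta[lit].append(const)     # a positive literal is falsified by "false"
    return alpha, beta


def is_faulty(assignment, alpha, beta, k):
    """True iff some clause has all three of its constants present."""
    present = set()
    for i, value in enumerate(assignment, start=1):
        present.update(alpha[i] if value else beta[i])
    return any({f"A{j}", f"B{j}", f"C{j}"} <= present for j in range(1, k + 1))


def satisfiable_via_constants(clauses, n):
    """Q is satisfiable iff some complete assignment is not faulty."""
    alpha, beta = build_alpha_beta(clauses, n)
    return any(
        not is_faulty(v, alpha, beta, len(clauses))
        for v in product([True, False], repeat=n)
    )
```

The brute-force search over assignments is exponential and serves only to confirm that, under this reading, "no clause has all three constants present" coincides with ordinary satisfaction of the formula.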



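The other transition rules are as follows:

$$\begin{array}{ll}
A_j \stackrel{a_j}{\to} A_j & \text{for } 1 \le j \le k \\
B_j \stackrel{b_j}{\to} B_j & \text{for } 1 \le j \le k \\
C_j \stackrel{c_j}{\to} C_j & \text{for } 1 \le j \le k \\
X_i \stackrel{d_j}{\to} X_i & \text{for } 1 \le i \le n+1,\ 1 \le j \le k,\ d_j \in \{a_j, b_j, c_j\} \\
X_{n+1} \stackrel{e}{\to} Y_1 \| \ldots \| Y_k & \\
Y_j \stackrel{d_j}{\to} Z_{j+1} \| W_j & \text{for } 1 \le j \le k,\ d_j \in \{a_j, b_j, c_j\} \\
W_j \stackrel{d_l}{\to} W_j & \text{for } 1 \le j \le k,\ 1 \le l \le j,\ d_l \in \{a_l, b_l, c_l\} \\
Z_j \stackrel{d_j}{\to} \varepsilon & \text{for } 1 \le j \le k,\ d_j \in \{a_j, b_j, c_j\} \\
Z_{k+1} \stackrel{a}{\to} Z_{k+1} &
\end{array}$$
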

Figure 0 gives a rough description of $\Delta$ in Petri net notation. Let $P_1 := (X_1 \| Z_1, \Delta)$
and $P_2 := (X_1, \Delta)$.

Lemma 2. If $Q$ is satisfiable then $P_1 \not\sim P_2$.

Proof. We show that the attacker has the following winning strategy. Since $Q$ is satisfiable,
there exists an assignment of variables that makes $Q$ true. The attacker can choose
this assignment by performing the corresponding actions $x_i$ or $\bar{x}_i$ for $1 \le i \le n$ in either
$P_1$ or $P_2$. Then the attacker does the action $e$. The defender can only respond by doing
exactly the same. This yields the new states $P_1'$ and $P_2'$ with $P_1' = P_2' \| Z_1$. For every
$j \in \{1, \ldots, k\}$ there is at least one constant $D_j \in \{A_j, B_j, C_j\}$ that does not appear in
$P_1'$ or $P_2'$. Let $d_j$ be the action corresponding to $D_j$, e.g. if $D_1 = B_1$ then $d_1 = b_1$.

The attacker performs the action $d_1$ by the rule $Z_1 \stackrel{d_1}{\to} \varepsilon$ in $P_1'$. Since neither $D_1$ nor
$Z_1$ occurs in $P_2'$, the defender can only respond by $Y_1 \stackrel{d_1}{\to} Z_2 \| W_1$. Let the resulting states
be $P_1''$ and $P_2''$. Now the attacker performs $Z_2 \stackrel{d_2}{\to} \varepsilon$ in $P_2''$ to which the defender can
only respond by $Y_2 \stackrel{d_2}{\to} Z_3 \| W_2$ in $P_1''$ and so on with $Z_j, d_j$ for $1 \le j \le k$. In the end
the defender is forced to perform the transition $Y_k \stackrel{d_k}{\to} Z_{k+1} \| W_k$. Now the action $a$ is
enabled in one process (by the constant $Z_{k+1}$), but not in the other. Thus the attacker
can win and $P_1 \not\sim P_2$. □



Lemma 3. If $Q$ is not satisfiable then $P_1 \sim P_2$.

Proof. Let AS (for ‘assignments’) be the set of subterms containing only constants
$A_j, B_j, C_j$ for $1 \le j \le k$. We call a term $t \in AS$ a faulty assignment iff there is at least
one $m \in \{1, \ldots, k\}$ s.t. all three constants $A_m, B_m, C_m$ occur in $t$. We call the minimal
such $m$ the index of $t$, denoted $ind(t)$. Let FAS be the set of faulty assignments. Since
$Q$ is not satisfiable, every assignment $t$ that is created by performing one of each pair
of actions $x_1/\bar{x}_1 \ldots x_n/\bar{x}_n$ is a faulty assignment. Any incomplete assignment $t$ that is
created by an incomplete prefix of choices from $x_1/\bar{x}_1 \ldots x_j/\bar{x}_j$ (with $j < n$) must in
the end become a faulty assignment once all choices from $x_1/\bar{x}_1 \ldots x_n/\bar{x}_n$ have been
made. Let $IFAS_j$ be the set of these incomplete faulty assignments created by choosing
one of each pair of actions $x_1/\bar{x}_1 \ldots x_j/\bar{x}_j$. Let $O_j$ be the set of terms containing only
constants $Y_l, W_l, Z_l, Z_{l+1}$ with $l \le j$. To keep the notation simple we define $W_0 := \varepsilon$.
The symmetric closure of the following relation is a bisimulation.

{{X.WtWZ't, Xi\\t\\Zl) I 1 < i < n A f G IFASi-i A w,r; G Nq} U 
{(X„+i||f||Zr, Xr,+i\\t\\Zl) \ t€FAS A A 

{{Y 4 ...\\Yk\\t\\Z'i, Y 4 ...Yk\\t\\Zl)\t€FAS A tx,r;G]No}U 
{(y,|| . . . \\Yu\\W,_ 4 t\\l. Y,+,\\ . . . \\Y,\\t\\Z,+,\\W,\W) I 
t G FAS A j + 1 < ind{t) A 7, 7' G Oy-r} U 
{(y,ii . . . iiniif||iT,_iii 7 , F,+iii . . . m\m,\\i) I 
t G FAS A j + 1 < ind{t) A 7, 7' G Oy-r} U 

{(y,+i|| . . . ||n||f||z,+i||iT,|| 7 , y,+i|| . . . WYkWtWW.Wi) \ 

t G FAS A j + 1 < ind{t) A 7, 7' G Oy-r} U 

{(iy,llz]Virr+ill • ■ • \\Yk\h\\t, vp,-||^]Vil|y,+i|| . . . Iliiciiyiif) | 

M G {0, 1} A f G FAS A 1 < j < fc A 7, 7' G Oj-i} 

Since $(X_1 \| Z_1, X_1)$ is in this relation, we get $P_1 \sim P_2$. □

Theorem 4. Strong bisimilarity of BPP is co-NP-hard.

Proof. Directly from Lemma 2 and Lemma 3 and the NP-completeness of 3-SAT.

Note that both $P_1$ and $P_2$ are bounded, i.e., they have only finitely many reachable
states. It is easy to see that in general the number of reachable states of $P_1$/$P_2$ is
exponential in the size of the description of $\Delta$. Moreover, the number of reachable
states of $P_1$/$P_2$ is even exponential up to strong bisimilarity, i.e., they generally have an
exponential number of non-bisimilar reachable states. Let $t \in FAS$. Analogously to the
definition of the index of $t$ we define $ind'(t)$ as the maximal $m$ s.t. all three $A_m, B_m, C_m$
appear in $t$. Consider the reachable states $X_{n+1} \| t_1 \| t_2$, where $t_1 \| t_2 \in FAS$ encodes
a faulty assignment and the constants $A_j, B_j, C_j$ in $t_1$ have $j \le ind'(t_1 \| t_2)$ and the
constants $A_j, B_j, C_j$ in $t_2$ have $j > ind'(t_1 \| t_2)$. In particular $ind'(t_1 \| t_2) = ind'(t_1)$.
While the particular structure of $t_1$ does not matter for bisimilarity (as long as $t_1 \in FAS$),
the structure of $t_2$ does. We have $X_{n+1} \| t_1 \| t_2 \not\sim X_{n+1} \| t_1 \| t_2'$ for every $t_2' \neq t_2$. Since
there are in general exponentially many different such $t_2$, it follows that $P_1$/$P_2$ is at
least exponential w.r.t. strong bisimilarity. Thus, our construction does not yield a lower
bound for the problem of strong bisimilarity of a BPP and a finite-state process (with
polynomially many states). It seems to be impossible to prove a lower bound for this
asymmetric problem, since whenever one encodes a sufficiently complex problem (e.g.
SAT) into a BPP, this BPP is never bisimilar (neither strongly nor weakly) to any finite-state
system of polynomial size (although it can be bisimilar to a finite-state system of
exponential size). Thus, we conjecture that strong and weak bisimilarity of a BPP and
a finite-state system is decidable in polynomial time. (It is known that strong and weak
bisimilarity of a normed BPP and a finite-state system is polynomial ITEl l.)

Now we consider the strong finiteness problem.

Strong finiteness of BPP
Instance: A BPP process $P$.
Question: Does there exist a finite-state system $F$ s.t. $P \sim F$?

Finiteness w.r.t. strong bisimilarity is decidable even for general Petri nets [14], and
this result carries over immediately to communication-free nets (i.e., BPP). However, the
algorithm in [14] consists of two semidecision procedures and gives no upper bound on
the complexity. For general Petri nets one gets an EXPSPACE lower bound by reducing
the problem of whether a given place can ever become marked to the finiteness problem.
For general Petri nets the problem of whether a given place can ever become marked is
EXPSPACE-hard [?]. For communication-free nets (i.e., BPP) this is different. While the
reachability problem is NP-complete for communication-free nets [?], it is easy to see
that the problem of whether a given place can ever become marked in a communication-free
net is polynomial. Thus, one does not obtain a lower bound for the strong finiteness
problem of BPP that way.
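The polynomial bound for communication-free nets can be illustrated by a small sketch. It assumes a hypothetical encoding in which each transition is a pair (input place, list of output places); since every transition consumes a token from exactly one place, a place can become marked iff it is reachable from an initially marked place in the induced graph. The function name and the example net are illustrative only.

```python
from collections import deque

def markable_places(initial_places, transitions):
    """Return all places that can ever carry a token.

    transitions: list of (input_place, output_places) pairs; in a
    communication-free net each transition has exactly one input place.
    """
    reachable = set(initial_places)
    queue = deque(reachable)
    while queue:
        place = queue.popleft()
        for source, outputs in transitions:
            if source != place:
                continue
            for out in outputs:
                if out not in reachable:
                    reachable.add(out)
                    queue.append(out)
    return reachable

# Hypothetical net: X -> X || Y, Y -> Z, W -> V (W is never marked).
net = [("X", ["X", "Y"]), ("Y", ["Z"]), ("W", ["V"])]
marked = markable_places(["X"], net)
```

The search visits each place once and scans the transition list per place, so it runs in time polynomial in the size of the net.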

It is clear that a constructive solution to the problem, i.e., constructing the finite-state 
system F if it exists, must require at least exponential time. This is because there are 
BPPs s.t. the smallest finite-state system F that is bisimilar to them has an exponential 
number of states (in the size of the description of the BPP). However, it is not immediately 
clear if a simple yes/no answer to the strong finiteness problem must be as hard. The 
following theorem shows this. 

Theorem 5. Strong finiteness of BPP is co-NP-hard.

Proof. By a polynomial reduction of 3-SAT to strong infiniteness. Let the formula $Q$
and the set of rules $\Delta$ be defined as before and let $\Delta' := \Delta \cup \{X_{n+1} \to X_{n+1} \| Z_1\}$.
We show that the process $X_1$ w.r.t. the set of rules $\Delta'$, denoted $(X_1, \Delta')$, is infinite
w.r.t. strong bisimilarity iff $Q$ is satisfiable.

$\Leftarrow$ If $Q$ is satisfiable then there are infinitely many reachable states $Y_1 \| \ldots \| Y_k \| \gamma \| Z_1^m$
for every $m \in \mathbb{N}_0$, where $\gamma$ is a term that encodes a satisfying assignment of $Q$.
This means that $\gamma$ is a parallel composition of constants $A_j, B_j, C_j$ where for every
$j \in \{1, \ldots, k\}$ at least one of the constants $A_j, B_j, C_j$ does not occur in $\gamma$. However,
for every $m_1 \ne m_2$ we have $Y_1 \| \ldots \| Y_k \| \gamma \| Z_1^{m_1} \not\sim Y_1 \| \ldots \| Y_k \| \gamma \| Z_1^{m_2}$, because
the attacker has a winning strategy similar to the one in Lemma 3. Thus $(X_1, \Delta')$ is
infinite w.r.t. strong bisimilarity.

$\Rightarrow$ Let $\Delta'' := \Delta \cup \{X_{n+1} \to X_{n+1}\}$. The process $(X_1, \Delta'')$ has finitely many reachable
states. (However, $(X_1, \Delta'')$ has an exponential (in the size of $\Delta''$) number of non-bisimilar
reachable states.) If $Q$ is not satisfiable then $(X_1, \Delta') \sim (X_1, \Delta'')$ and is
thus finite w.r.t. bisimilarity. The bisimulation relation is the same as in Lemma [?].

□

The previous construction shows that the problem of whether a place can be unmasked (i.e.,
made to count w.r.t. bisimulation) is NP-hard. Here this particular place was $Z_1$.



4 Hardness of Weak Bisimilarity for BPP 

Weak bisimilarity of BPP
Instance: Two BPP processes $P_1$ and $P_2$.
Question: $P_1 \approx P_2$?

It is still an open question if this problem is decidable. It has been shown to be
semidecidable in [?], using the facts that weak bisimulation equivalence on BPPs is
semilinear (since it is a congruence on a finitely generated commutative semigroup) and
that it is decidable if a given semilinear relation on a BPP is a weak bisimulation. An
NP lower bound for this problem has been shown in [?] (by reduction of a variant
of the bin-packing problem), and the co-NP lower bound of Theorem 4 carries over
immediately to weak bisimilarity. Here we prove a $\Pi_2^p$ lower bound (in the polynomial
hierarchy) that subsumes these results.

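Let $Q = Q_1 \wedge \ldots \wedge Q_k$ be a boolean formula in 3-CNF over the boolean variables
$x_1, \ldots, x_n, y_1, \ldots, y_n$ with $k$ clauses. We construct BPP processes $P_1, P_2$ s.t. $P_1 \approx P_2$
iff $\forall(x_1, \ldots, x_n)\exists(y_1, \ldots, y_n)\, Q$. Since this problem is $\Pi_2^p$-complete, we get a $\Pi_2^p$
lower bound for the problem of weak bisimilarity.

Let $\alpha_i$ be a parallel composition of constants in $\{Q_1, \ldots, Q_k\}$ s.t. constant $Q_j$
appears in $\alpha_i$ iff $x_i$ makes clause $Q_j$ true (i.e., $x_i$ appears positively in $Q_j$). Let $\beta_i$ be a
parallel composition of constants in $\{Q_1, \ldots, Q_k\}$ s.t. constant $Q_j$ appears in $\beta_i$ iff $\bar{x}_i$
makes clause $Q_j$ true (i.e., $x_i$ appears negatively in $Q_j$). Let $\gamma_i$ be a parallel composition
of constants in $\{Q_1, \ldots, Q_k\}$ s.t. constant $Q_j$ appears in $\gamma_i$ iff $y_i$ makes clause $Q_j$ true.
Let $\delta_i$ be a parallel composition of constants in $\{Q_1, \ldots, Q_k\}$ s.t. constant $Q_j$ appears
in $\delta_i$ iff $\bar{y}_i$ makes clause $Q_j$ true. The set of transition rules $\Delta$ is defined by

$$
\begin{array}{lll}
X_i \xrightarrow{x_i} X_{i+1} \| \alpha_i & & \text{for } 1 \le i \le n \\
X_i \xrightarrow{\bar{x}_i} X_{i+1} \| \beta_i & & \text{for } 1 \le i \le n \\
X'_i \xrightarrow{x_i} X'_{i+1} \| \alpha_i & & \text{for } 1 \le i \le n \\
X'_i \xrightarrow{\bar{x}_i} X'_{i+1} \| \beta_i & & \text{for } 1 \le i \le n \\
X_{n+1} \xrightarrow{a} Y_1 \| \ldots \| Y_n & & \\
X'_{n+1} \xrightarrow{a} Y_1 \| \ldots \| Y_n & & \\
X'_{n+1} \xrightarrow{a} Z & & \\
Y_i \xrightarrow{\tau} \gamma_i & & \text{for } 1 \le i \le n \\
Y_i \xrightarrow{\tau} \delta_i & & \text{for } 1 \le i \le n \\
Q_j \xrightarrow{q_j} Q_j & & \text{for } 1 \le j \le k \\
Z \xrightarrow{q_j} Z & & \text{for } 1 \le j \le k
\end{array}
$$

Let $P_1 := (X_1, \Delta)$ and $P_2 := (X'_1, \Delta)$.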
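The clause sets used in this construction, and the quantified property they encode, can be checked by brute force on small instances. The sketch below is only an illustration under assumed conventions: a literal is a hypothetical triple `(kind, index, polarity)`, a clause is a set of literals, and `alpha`, `beta`, `gamma`, `delta` hold the indices of the clauses made true by $x_i$, $\bar{x}_i$, $y_i$, $\bar{y}_i$. It runs in exponential time and is not part of the reduction itself.

```python
from itertools import product

def clause_sets(clauses, n):
    """Index sets of clauses made true by each literal (clauses numbered from 1)."""
    def made_true_by(literal):
        return {j for j, clause in enumerate(clauses, start=1) if literal in clause}
    alpha = {i: made_true_by(("x", i, True)) for i in range(1, n + 1)}
    beta = {i: made_true_by(("x", i, False)) for i in range(1, n + 1)}
    gamma = {i: made_true_by(("y", i, True)) for i in range(1, n + 1)}
    delta = {i: made_true_by(("y", i, False)) for i in range(1, n + 1)}
    return alpha, beta, gamma, delta

def covered(values, pos, neg):
    """Clauses made true by an assignment given as a tuple of booleans."""
    result = set()
    for i, value in enumerate(values, start=1):
        result |= pos[i] if value else neg[i]
    return result

def forall_exists(clauses, n):
    """Brute-force check of: for all x there exists y such that Q holds."""
    alpha, beta, gamma, delta = clause_sets(clauses, n)
    all_clauses = set(range(1, len(clauses) + 1))
    assignments = list(product([True, False], repeat=n))
    return all(
        any(covered(xs, alpha, beta) | covered(ys, gamma, delta) == all_clauses
            for ys in assignments)
        for xs in assignments
    )

# (x1 or y1) and (not x1 or not y1): choose y1 as the negation of x1.
q_true = [{("x", 1, True), ("y", 1, True)}, {("x", 1, False), ("y", 1, False)}]
# The single clause (x1) fails for x1 = False whatever y1 is.
q_false = [{("x", 1, True)}]
```

The union `covered(xs, ...) | covered(ys, ...)` mirrors the parallel composition of the $\alpha_i$/$\beta_i$ and $\gamma_i$/$\delta_i$ terms: the formula is satisfied exactly when every clause constant $Q_j$ is present.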

Lemma 6. If $\forall(x_1, \ldots, x_n)\exists(y_1, \ldots, y_n)\, Q$ is false then $P_1 \not\approx P_2$.

Proof. If $\forall(x_1, \ldots, x_n)\exists(y_1, \ldots, y_n)\, Q$ is false then $\exists(x_1, \ldots, x_n)\forall(y_1, \ldots, y_n)\, \neg Q$.
The attacker chooses these values for $x_1, \ldots, x_n$ by choosing $x_i$/$\bar{x}_i$. The defender can
only copy these moves. Then the attacker chooses the transition $X'_{n+1} \to Z$. The
defender can only respond by $X_{n+1} \to Y_1 \| \ldots \| Y_n$ and then a sequence of silent
$\tau$-actions ending in a state $t$. By definition of $\Delta$ and since $\exists(x_1, \ldots, x_n)\forall(y_1, \ldots, y_n)\, \neg Q$
there will be at least one action $q_j$ (with $1 \le j \le k$) that is not enabled by $t$ (and cannot
be made enabled by $\tau$-moves). However, all $q_j$ are enabled by $Z$. Thus, the attacker
has a winning strategy and $P_1 \not\approx P_2$. □

Lemma 7. If $\forall(x_1, \ldots, x_n)\exists(y_1, \ldots, y_n)\, Q$ then $P_1 \approx P_2$.

Proof. The attacker can choose the assignment for $x_1, \ldots, x_n$. The defender can only
imitate these choices. If the attacker chooses the transition $X_{n+1} \to Y_1 \| \ldots \| Y_n$ or
$X'_{n+1} \to Y_1 \| \ldots \| Y_n$ then the defender can respond in such a way that the two processes
become equal and the defender wins. If the attacker chooses $X'_{n+1} \to Z$ then the
defender can (by a long internal move of $\tau$-actions) choose the values for $y_1, \ldots, y_n$ on
his side. Since $\forall(x_1, \ldots, x_n)\exists(y_1, \ldots, y_n)\, Q$ there are choices for $y_1, \ldots, y_n$ s.t. in the
resulting state all actions $q_1, \ldots, q_k$ are permanently enabled. Since $q_1, \ldots, q_k$ are also
permanently enabled by $Z$ in the other process and all other actions are not, the defender
wins. Thus, the defender has a winning strategy and $P_1 \approx P_2$. □

Theorem 8. Weak bisimilarity of BPP is $\Pi_2^p$-hard.

Proof. Directly from Lemma 6 and Lemma 7. □

Weak finiteness of BPP
Instance: A BPP process $P$.
Question: Does there exist a finite-state system $F$ s.t. $P \approx F$?

We show that the weak finiteness problem for BPP is also $\Pi_2^p$-hard by using the
previously defined processes $P_1$ and $P_2$ and constructing a new process $P$ that is weakly
finite iff $P_1 \approx P_2$. Let $\Delta'$ be $\Delta \cup \Gamma$, where $\Gamma$ is the following set of transition rules:
/ A/IIC /Ae CAe D-^E D ^ E' D ^ Xi\\S 

D^X[\\S E^E E'^E' S A Allis' E' ^ X[\\S SAS 

Let $P := (I \| D, \Delta')$.

Lemma 9. If $P_1 \not\approx P_2$ then $P$ is not weakly finite.

Proof. $P$ has infinitely many non-weakly-bisimilar states $D \| C^i$ for all $i \in \mathbb{N}$. It suffices
to show that $D \| C^j \not\approx D \| C^i$ for $j > i$. The attacker has the following winning strategy.
He does action $c$ exactly $i + 1$ times in $D \| C^j$ and reaches the state $D \| C^{j-i-1}$. The
defender can respond in different ways in $D \| C^i$, but the reached state will always be
either $E \| C^k$ or $E' \| C^k$ for some $k \le i$. In the first case the attacker does the transition
$D \to X'_1 \| S$. The defender can only respond by $E \to X_1 \| S$ and the new state in the
bisimulation game is $(X'_1 \| S \| C^{j-i-1}, X_1 \| S \| C^k)$. This is not weakly bisimilar, because
$P_1 \not\approx P_2$. The second case is symmetric with $X_1$ and $X'_1$ exchanged. □

Lemma 10. If $P_1 \approx P_2$ then $P$ is weakly finite.

Proof. Let $\Gamma'$ be $\Gamma$ where $X'_1$ is replaced by $X_1$ and $\Delta'' := \Delta \cup \Gamma'$. Since $P_1 \approx P_2$ and
weak bisimilarity is a congruence on BPP, we get $P = (I \| D, \Delta') \approx (I \| D, \Delta'')$. It is
easy to see that $(I \| D, \Delta'') \approx (E, \Delta'')$, because $S \to S$. Thus $P \approx (E, \Delta'')$. However,
$(E, \Delta'')$ has only finitely many reachable states. □

Theorem 11. Weak finiteness of BPP is $\Pi_2^p$-hard.

Proof. By Lemmas 6, 7, 9 and 10. □

5 Conclusion 

The following table summarizes known results about the complexity of bisimulation 
problems for several classes of infinite-state systems. New results are in boldface. The 
different columns in the table below show the results about the following problems: 
strong bisimilarity with finite automata, strong bisimilarity of two infinite-state systems, 
weak bisimilarity with finite automata and weak bisimilarity of two infinite-state systems.

      | ~ F                             | ~                                | ≈ F                             | ≈
FS    | P [?]                           | P [?]                            | P [?]                           | P [?]
BPA   | P [?]                           | ∈ 2-EXPTIME [?]                  | P [?]                           | PSPACE-hard [?]
PDA   | ∈ EXPTIME [?], PSPACE-hard [?]  | decidable [29], PSPACE-hard [23] | ∈ EXPTIME [?], PSPACE-hard [23] | PSPACE-hard [?]
BPP   | ∈ PSPACE [?]                    | decidable [?], co-NP-hard        | ∈ PSPACE [?]                    | Π^p_2-hard
PA    | decidable [?]                   | co-NP-hard                       | decidable [15]                  | PSPACE-hard [?]
PAD   | decidable [15], PSPACE-hard [?] | PSPACE-hard [?]                  | decidable [15], PSPACE-hard [?] | PSPACE-hard [?]
PN    | decidable [?], EXPSPACE-hard    | undecidable [13]                 | undecidable [?]                 | undecidable [13]
PAN   | EXPSPACE-hard                   | undecidable [13]                 | undecidable [13]                | undecidable [13]
PRS   | EXPSPACE-hard                   | undecidable [13]                 | undecidable [13]                | undecidable [13]


The following table summarizes results about the problems of strong and weak 
finiteness. New results are in boldface. 





        | strong finiteness             | weak finiteness
BPA     | ∈ 2-EXPTIME [?]               | ?
PDA     | PSPACE-hard [?]               | PSPACE-hard [?]
BPP     | decidable [14], co-NP-hard    | Π^p_2-hard
PA      | co-NP-hard                    | Π^p_2-hard
PAD     | PSPACE-hard [?]               | PSPACE-hard [?]
PN      | decidable [14], EXPSPACE-hard | undecidable [14]
PAN/PRS | EXPSPACE-hard                 | undecidable [?]



Some more results are known about the restricted subclasses of these systems that
satisfy the ‘normedness condition’ (e.g. [?]).

Abstract. We show decidability of several first-order logics based upon
the reachability predicate in PA. The main tool we use is the recognizability
by tree automata of the reachability relation between PA-processes.
This approach and the transition logics we use allow a smooth and general
treatment of parameterized model checking for PA. Then the logic
is extended to handle a general notion of costs of PA-steps. In particular,
when costs are Parikh images of traces, we show decidability of a
transition logic extended by some form of first-order reasoning over costs.



1 Introduction 

PA [BW90] is the formal model of concurrency allowing recursive definitions,
sequential and parallel compositions, but where actions are uninterpreted and do
not synchronize. PA gives rise to infinite-state systems and is quite expressive.
Recently, several verification problems have been shown decidable for PA using
a variety of fairly involved techniques¹. Decidability results for PA are mainly
theoretical, and they help delineate some important frontiers in the field of
verification for infinite-state processes. In [?] we advocated regular tree languages
and tree automata as an easy-to-use tool for tackling problems about PA, and
we proved that the reachability set of a regular set of PA processes is regular.
Our proofs are effective and give simple polynomial-time algorithms which can
be used for a variety of problems based on reachability among PA-processes.
[?] showed how to use and adapt them for problems in static analysis.

Contents of the paper. Here we extend our previous work in several ways: 

Recognizable tree relations: We replace automata for tree languages by automata
for tree relations and show that $\xrightarrow{*}$ over PA-processes is recognizable.
First-order transition logic: This gives a decision method for the first-order
transition logic, i.e. the first-order logic having $\to$, $\xrightarrow{*}$, and equality as basic

^ See, e.g.. lREH95b:BPH95alKucD6lRuc9VUKM98mj99lMav99l . 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 342-ESI 2000. 
predicates (plus any other recognizable predicates). The method computes
the set of solutions of a given formula, and thus allows parameterized model
checking, model measuring, ...

Costs: We enrich PA with a notion of “cost of steps” which is more general
than traces. These costs can encode various measures and view PA as a truly
concurrent model (e.g., costs can encode timing measures where parallelism
is faster than interleaving). We extend the transition logic with decomposable
cost predicates and show the decidability of several timed transition logics.

Parameterized constraints over $\xrightarrow{*}$: Finally, we define TLC, the transition
logic where costs are the Parikh images of traces and where integer variables
and Presburger formulas are freely used to state constraints on reachability.
TLC is not decidable but we isolate a rich fragment which is.



Related work. Several temporal logics with cost constraints have been proposed
for finite state systems (see [?] for recent proposals). Some fragments
of the temporal logics from [BEH95b, BEH95a] apply to PA but temporal
logics deal with paths and are quite different from transition logics where a
first-order theory of states is available (more explanations in Section 7.1).

For costs that are Parikh images of traces, [?] shows recognizability of
the ternary relation $s \xrightarrow{c} t$ over BPP (PA without sequential composition) but
does not consider applications to the first-order transition logic. [?] shows
recognizability of the reachability relation between configurations of timed
automata, introduces the transition logic and uses it for model measuring and
parameterized model checking. An important technical difference is that our
automata recognize pairs of trees (PA-processes) while [?] handles tuples of
integers (markings of BPP’s) and [?] handles tuples of reals (clock values).

Reachability in PA is investigated in [May97, May99]. The underlying methods
apply to more general systems (like PRS [?]) but they are quite complex
since they view terms modulo structural congruence. As explained in [LS99], we
believe it is better to only introduce structural congruence at a later stage.

The combination of costs and tree automata is studied by Seidl [Sei94] for
compiler optimization (using more general costs than ours), not decidability of
logics on trees (where the relevant problem is the combination of cost automata).



2 The PA Process Algebra 

PA may be defined in several (equivalent) ways. Our definition:

(1) uses rewrite rules à la Moller [?],

(2) does not identify terms modulo structural congruence,

(3) incorporates a notion of costs for steps,

(4) is a big-steps semantics (in the sense of [Plo81]).

Syntax. We assume $Act = \{a, b, \ldots\}$ is a finite set of action names and write
$Act^* = \{w, \ldots\}$ for the set of words over $Act$ (with empty string denoted $\varepsilon$).
For $w_1, w_2 \in Act^*$, we let $w_1 \| w_2$ denote their shuffle product, i.e. the set of all
their interleavings.
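As an illustration of the shuffle product, here is a minimal sketch that enumerates all interleavings of two words. It assumes words are represented as Python strings with one character per action; the function name `shuffle` is chosen for this example only.

```python
def shuffle(w1, w2):
    """Return the set of all interleavings of the words w1 and w2."""
    if not w1:
        return {w2}
    if not w2:
        return {w1}
    # The first action comes either from w1 or from w2.
    from_first = {w1[0] + rest for rest in shuffle(w1[1:], w2)}
    from_second = {w2[0] + rest for rest in shuffle(w1, w2[1:])}
    return from_first | from_second

interleavings = shuffle("ab", "c")
```

Each interleaving preserves the relative order of the actions of `w1` and of `w2`, which is why "bac" is not in `shuffle("ab", "c")`.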

We assume $M = \{c, \ldots\}$ is a set of values, called costs, equipped with two
binary associative-commutative operations $\oplus$ and $\otimes$ with the same neutral element
$0_M$ (we see several different examples of cost sets in Sections 6 and 7). Given a set
$Const = \{X, Y, Z, \ldots\}$ of process constants (or names), $T_{Const}$, or $T$ when the
underlying $Const$ is clear, is the set $\{s, t, \ldots\}$ of PA-terms, given by the following
abstract syntax

$$s, t ::= s.t \mid s \| t \mid 0 \mid X \mid Y \mid Z \mid \cdots$$

For $t \in T$, $Const(t)$ denotes the set of process constants occurring in $t$. A PA
declaration is a finite $Const$ with a finite set $\Delta \subseteq Const \times Act \times M \times T$ of process
rewrite rules. A rule $(X, a, c, t) \in \Delta$ is written $X \xrightarrow{a,c} t$. For simplicity, we require
that all $X \in Const$ appear in the left-hand side of at least one rule.

For $t \in T$, we let $Sub(t)$ denote the set of all subterms of $t$. Similarly, we write
$Sub(\Delta)$ for the finite set of all subterms of (some term from) $\Delta$. The size of a term
is $|t| \stackrel{\mathrm{def}}{=} Card(Sub(t))$ and the size of a PA declaration is $|\Delta| \stackrel{\mathrm{def}}{=} Card(Sub(\Delta))$.
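The notions $Sub(t)$, $|t|$ and $Const(t)$ can be made concrete with a small sketch. The term encoding is an assumption made for this example: the string "0" is the empty process, any other string is a process constant, and the tuples `("seq", s, t)` and `("par", s, t)` stand for $s.t$ and $s \| t$.

```python
def subterms(t):
    """Sub(t): the set of all subterms of t, including t itself."""
    result = {t}
    if isinstance(t, tuple):
        _, left, right = t
        result |= subterms(left) | subterms(right)
    return result

def size(t):
    """|t| = Card(Sub(t)); repeated subterms are counted once."""
    return len(subterms(t))

def constants(t):
    """Const(t): the process constants occurring in t."""
    return {s for s in subterms(t) if isinstance(s, str) and s != "0"}

# The term (X.Y) || X
term = ("par", ("seq", "X", "Y"), "X")
```

Because `Sub(t)` is a set, the two occurrences of `X` in the example term contribute a single element, so its size is 4 rather than 5.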



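Semantics. A PA declaration $\Delta$ defines a labeled transition system $(T, \to)$ with
$\to \subseteq T \times Act^* \times M \times T$. We write $s \xrightarrow{w,c} t$ when $(s, w, c, t) \in \to$. The transition
relation is defined by the following SOS rules:

$$
(R_\varepsilon)\ \frac{}{0 \xrightarrow{\varepsilon,\, 0_M} 0}
\qquad
(R'_\varepsilon)\ \frac{}{X \xrightarrow{\varepsilon,\, 0_M} X}
\qquad
(R_S)\ \frac{t_1 \xrightarrow{w,c} t_1'}{t_1.t_2 \xrightarrow{w,c} t_1'.t_2}
$$

$$
(R_C)\ \frac{t \xrightarrow{w,c} t'}{X \xrightarrow{a.w,\, c' \oplus c} t'}\ \text{ if } X \xrightarrow{a,c'} t \in \Delta
\qquad
(R_P)\ \frac{t_1 \xrightarrow{w_1,c_1} t_1' \quad t_2 \xrightarrow{w_2,c_2} t_2'}{t_1 \| t_2 \xrightarrow{w,\, c_1 \otimes c_2} t_1' \| t_2'}\ \text{ if } w \in w_1 \| w_2
$$

$$
(R'_S)\ \frac{t_1 \xrightarrow{w_1,c_1} t_1' \quad t_2 \xrightarrow{w_2,c_2} t_2'}{t_1.t_2 \xrightarrow{w_1.w_2,\, c_1 \oplus c_2} t_1'.t_2'}\ \text{ if } Const(t_1') = \emptyset
$$

The intuition formalized by $s \xrightarrow{w,c} t$ is that $s$ can evolve into $t$ by performing the
sequence of actions $w$ and the cost of that derivation is $c$. In general there may
exist several different derivations between an $s$ and a $t$: they may have the same cost
or not, and use the same sequence of actions or not. For instance, if
$\Delta = \{X \xrightarrow{a,c} Y,\ X \xrightarrow{a,c'} Y\}$, then the cost of reaching $Y$ from $X$ is either $c$ or $c'$.

We write $s \xrightarrow{w} t$ (or $s \xrightarrow{c} t$, or $s \xrightarrow{*} t$) when $s \xrightarrow{w,c} t$ for some $c$ (resp. for
some $w$, for some $w, c$). Even though we started with a big-steps semantics, it is
convenient to have small-steps too and we write $s \to t$ when $s \xrightarrow{w} t$ for some $w$
of length 1. We also use $\xrightarrow{+}$ defined as $\to \circ \xrightarrow{*}$.

For $t \in T$, the sets $Post^*(t) = \{t' \mid t \xrightarrow{*} t'\}$ and $Pre^*(t) = \{t' \mid t' \xrightarrow{*} t\}$ denote
the set of iterated successors and (resp.) iterated predecessors of $t$. $Post^*(t)$ is
also called the reachability set of $t$.

3 Tree Automata and Regular Cost Grammars

Given a finite ranked alphabet $\mathcal{F} = \mathcal{F}_0 \cup \mathcal{F}_1 \cup \cdots \cup \mathcal{F}_m$, $T_\mathcal{F}$ denotes the set of
terms (or finite trees) built from $\mathcal{F}$. A tree language is any subset $L$ of $T_\mathcal{F}$.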

3.1 Tree Automata and Regular Tree Grammars 

Tree automata recognize sets of trees (see [CDG+9?]). Formally, a tree automaton
is a tuple $A = (\mathcal{F}, Q, \delta)$ where $\mathcal{F}$ is a finite ranked alphabet, $Q = \{q_1, \ldots\}$
is a finite set of states, and $\delta \subseteq \bigcup_{n \in \mathbb{N}} (\mathcal{F}_n \times Q^n \times Q)$ is a finite set of transition
rules. A rule $(f, q_1, \ldots, q_n, q) \in \delta$ is usually written $f(q_1, \ldots, q_n) \mapsto q$ and is
read as “if the subterms $t_1, \ldots, t_n$ of some $t = f(t_1, \ldots, t_n)$ have been labeled by
$q_1, \ldots, q_n$, then $A$ may label $t$ with $q$”. Given a term $t$, the automaton labels the
nodes of $t$ by states according to the rules in a bottom-up way. We write $t \mapsto^* q$
when $t \in T_\mathcal{F}$ may be rewritten into $q \in Q$ using the rules of $\delta$.

Recognizability. When $t \mapsto^* q$, we say $t$ is recognized (also accepted) by state
$q$. We write $L(q)$ for $\{t \mid t \mapsto^* q\}$ and say it is the tree language recognized by
$q$. If we add a set $F \subseteq Q$ of final states to some tree automaton $A$, then the
language recognized by $A$ is $L(A) = \bigcup_{q \in F} L(q) = \{t \mid \exists q \in F,\ t \mapsto^* q\}$. We say
that $L \subseteq T_\mathcal{F}$ is recognizable if $L = L(A)$ for some tree automaton. Recognizable
tree languages are closed under union, intersection, complementation.
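The bottom-up labelling described above can be sketched as follows. This is an illustrative encoding, not the one used in the paper: a term is a tuple `(symbol, child, ...)`, and the transition rules are a dictionary mapping `(symbol, tuple_of_child_states)` to the set of states the automaton may assign. The example automaton, which accepts trees with an even number of leaves `a`, is hypothetical.

```python
from itertools import product

def states_of(term, rules):
    """Return the set of states q such that term can be rewritten into q."""
    symbol, *children = term
    child_states = [states_of(child, rules) for child in children]
    result = set()
    # Try every combination of states reachable for the children.
    for combo in product(*child_states):
        result |= rules.get((symbol, combo), set())
    return result

def accepts(term, rules, final_states):
    """A term is recognized if some reachable state is final."""
    return bool(states_of(term, rules) & final_states)

# Hypothetical automaton over {a/0, f/2} tracking the parity of a-leaves.
rules = {
    ("a", ()): {"odd"},
    ("f", ("even", "even")): {"even"},
    ("f", ("odd", "odd")): {"even"},
    ("f", ("even", "odd")): {"odd"},
    ("f", ("odd", "even")): {"odd"},
}
a = ("a",)
two_leaves = ("f", a, a)
three_leaves = ("f", a, ("f", a, a))
```

Computing a set of states per node handles nondeterministic rule sets directly, which amounts to running the subset construction on the fly.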

ε-rules. Tree automata with ε-rules further allow rules of the form $q \mapsto q'$.
These rules may be used anywhere inside $t \mapsto^* q''$. With the classical subset
construction, tree automata with ε-rules can be transformed into equivalent
deterministic tree automata without ε-rules.

Top-down tree automata are structurally bottom-up tree automata but label
trees in a top-down way. The difference is that a rule is now written
$q \mapsto f(q_1, \ldots, q_n)$ (or $q' \mapsto q$ for an ε-rule) and that the final states are now called
initial states. The most important difference between top-down and bottom-up
automata is a difference in viewpoint. Top-down tree automata are often seen
as regular tree grammars, i.e. as generators rather than acceptors.

3.2 Recognizability of Post* (X) 

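We now show that, for any PA-declaration $\Delta$, and any $X \in Const$, the set
$Post^*(X)$ is a regular tree language and give at the same time a grammar for
the costs of the derivations. This generalizes the result of [?] to costs and
uses a slightly different approach which yields the proof that the relation $\xrightarrow{*}$ is
recognizable.

Given a cost set $(M, \oplus, \otimes)$, a regular cost grammar over $M$ is a set $\mathcal{C}$ of
non-terminals $C_1, \ldots$ together with cost rules $C_1 \to E_1, \ldots, C_n \to E_n$ where the $E$’s
are simple right-hand sides of the form “$C \oplus C'$” or “$C \otimes C'$” (where $C, C'$ are
non-terminals), or “$c$” (where $c \in M$). Cost grammars describe subsets of $M$:
the rules define a set of recursive inclusions that admit a least fixpoint solution.
For simplicity we write $C$ for the subset of $M$ defined by the non-terminal $C$
and use shortcuts like $C \to C'$ (ε-rules) or $C \to c \oplus C'$ for grammar rules.

A regular tree grammar and a regular cost grammar for $Post^*(X)$:

$$
\begin{array}{llll}
I_0 \mapsto 0 & & C^{I}_0 \to 0_M & \text{if } 0 \in Sub(\Delta) \\
Q_0 \mapsto 0 & & C^{Q}_0 \to 0_M & \\
Q'_0 \mapsto 0 & & C^{Q'}_0 \to 0_M & \\[4pt]
I_Y \mapsto Y & & C^{I}_Y \to 0_M & \text{for all } Y \in Sub(\Delta) \\
Q_Y \mapsto Y & & C^{Q}_Y \to 0_M & \\[4pt]
Q_Y \mapsto Q_s & & C^{Q}_Y \to c \oplus C^{Q}_s & \text{for all } Y \xrightarrow{a,c} s \in \Delta \\
Q'_Y \mapsto Q'_s & & C^{Q'}_Y \to c \oplus C^{Q'}_s & \\[4pt]
I_{s_1 \| s_2} \mapsto I_{s_1} \| I_{s_2} & & C^{I}_{s_1 \| s_2} \to C^{I}_{s_1} \otimes C^{I}_{s_2} & \text{for all } s_1 \| s_2 \in Sub(\Delta) \\
Q_{s_1 \| s_2} \mapsto Q_{s_1} \| Q_{s_2} & & C^{Q}_{s_1 \| s_2} \to C^{Q}_{s_1} \otimes C^{Q}_{s_2} & \\
Q'_{s_1 \| s_2} \mapsto Q'_{s_1} \| Q'_{s_2} & & C^{Q'}_{s_1 \| s_2} \to C^{Q'}_{s_1} \otimes C^{Q'}_{s_2} & \\[4pt]
I_{s_1.s_2} \mapsto I_{s_1}.I_{s_2} & & C^{I}_{s_1.s_2} \to C^{I}_{s_1} \oplus C^{I}_{s_2} & \text{for all } s_1.s_2 \in Sub(\Delta) \\
Q_{s_1.s_2} \mapsto Q_{s_1}.I_{s_2} & & C^{Q}_{s_1.s_2} \to C^{Q}_{s_1} \oplus C^{I}_{s_2} & \\
Q_{s_1.s_2} \mapsto Q'_{s_1}.Q_{s_2} & & C^{Q}_{s_1.s_2} \to C^{Q'}_{s_1} \oplus C^{Q}_{s_2} & \\
Q'_{s_1.s_2} \mapsto Q'_{s_1}.Q'_{s_2} & & C^{Q'}_{s_1.s_2} \to C^{Q'}_{s_1} \oplus C^{Q'}_{s_2} &
\end{array}
$$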



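These rules denote both a tree automaton $A_{Post^*}$ and a cost grammar $\mathcal{C}_{Post^*}$.
Their relationship is stated in the following proposition:

Proposition 3.1. For all $t \in Sub(\Delta)$:

1. $I_t \mapsto^* s$ iff $s = t$ and $t \xrightarrow{*} s$. Furthermore, $C^{I}_t = \{0_M\}$.

2. $Q_t \mapsto^* s$ iff $t \xrightarrow{*} s$. Furthermore, if $t \xrightarrow{c} s$ then $c \in C^{Q}_t$, and if $c \in C^{Q}_t$ then
$t \xrightarrow{c} s'$ for some $s'$ such that $Q_t \mapsto^* s'$.

3. $Q'_t \mapsto^* s$ iff $t \xrightarrow{*} s$ and $s$ is terminated. Furthermore, if $t \xrightarrow{c} s$ then $c \in C^{Q'}_t$,
and if $c \in C^{Q'}_t$ then $t \xrightarrow{c} s'$ for some terminated $s'$ such that $Q'_t \mapsto^* s'$.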



4 Tree Automata and n-ary Relations 

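Products of trees. We follow [?]. Given two terms $s, t \in T_\mathcal{F}$, the pair $(s, t)$ can
be seen as one term over a product alphabet $\mathcal{F}_\times = (\mathcal{F} \cup \{\bot\}) \times (\mathcal{F} \cup \{\bot\}) \setminus \{\bot\bot\}$
where $\bot$ is a new symbol with arity 0. In $\mathcal{F}_\times$ the arity of $fg$ is the maximum
of the arities of $f$ and $g$. Formally we define $s \times t$ as the term in $T_{\mathcal{F}_\times}$ given
recursively by

$$
f(s_1, \ldots, s_n) \times g(t_1, \ldots, t_m) =
\begin{cases}
fg(s_1 \times t_1, \ldots, s_n \times t_n, \bot \times t_{n+1}, \ldots, \bot \times t_m) & \text{if } n < m, \\
fg(s_1 \times t_1, \ldots, s_m \times t_m, s_{m+1} \times \bot, \ldots, s_n \times \bot) & \text{otherwise.}
\end{cases}
$$

For instance the product $f(a, g(b)) \times f(f(a, a), b)$ is $ff(af(\bot a, \bot a), gb(b\bot))$. This
definition is extended to products of $n$ terms $s_1 \times \ldots \times s_n$ in the obvious way.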
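The product of two terms can be computed by padding the shorter list of children with the new nullary symbol. The sketch below assumes terms are tuples `(symbol, child, ...)` and writes product symbols by string concatenation; the helper names `tree_product` and `show` are chosen for this example. It reproduces the instance given above.

```python
BOT = ("⊥",)  # the new symbol of arity 0

def tree_product(s, t):
    """Return the term s x t over the product alphabet."""
    f, *s_children = s
    g, *t_children = t
    width = max(len(s_children), len(t_children))
    # Pad the side with fewer children so both lists have the same length.
    s_children = s_children + [BOT] * (width - len(s_children))
    t_children = t_children + [BOT] * (width - len(t_children))
    return (f + g, *[tree_product(a, b) for a, b in zip(s_children, t_children)])

def show(term):
    """Print a term in the usual functional notation."""
    symbol, *children = term
    if not children:
        return symbol
    return symbol + "(" + ", ".join(show(c) for c in children) + ")"

a, b = ("a",), ("b",)
s = ("f", a, ("g", b))      # f(a, g(b))
t = ("f", ("f", a, a), b)   # f(f(a, a), b)
result = show(tree_product(s, t))
```

The pair of padding symbols never occurs as a node label, since padding is applied to at most one side at each node.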
Definition 4.1. An $n$-ary relation $R \subseteq T_\mathcal{F} \times \cdots \times T_\mathcal{F}$ is recognizable iff the set
of all $s_1 \times \ldots \times s_n$ for $(s_1, \ldots, s_n) \in R$ is a regular tree language.

For instance $Id \stackrel{\mathrm{def}}{=} \{(s, s) \mid s \in T_\mathcal{F}\}$ is a recognizable relation and an automaton
accepting $Id$ needs only one state $q$ (which is final) and rules $ff(q, \ldots, q) \mapsto q$
for all $f \in \mathcal{F}$. The intersection, the union and the complement of recognizable
$n$-ary relations are also recognizable. The main consequence is that the first-order
theory of recognizable relations over finite trees is decidable, or, more precisely:

Theorem 4.2. Let $\varphi(x_1, \ldots, x_n)$ be a first-order formula involving recognizable
relations $R_1, \ldots$, and let $Sol(\varphi)$ denote $\{(t_1, \ldots, t_n) \mid\ \models \varphi(t_1, \ldots, t_n)\}$. Then
$Sol(\varphi)$ is a recognizable subset of $T_\mathcal{F}^n$. Furthermore, from automata $A_1, \ldots$
recognizing $R_1, \ldots$, one can build an automaton $A_\varphi$ recognizing $Sol(\varphi)$.

4.1 Recognizability of the Reachability Relation 

We can extend the proof that all $Post^*(X)$ are recognizable languages into a
construction showing that the relation $\xrightarrow{*}$ is recognizable, providing a top-down
tree automaton for $\xrightarrow{*}$. The main feature of the construction is to add three
specific non-terminals $I$ (for identity), $R$ (for rewrite) and $R'$ (for rewrite and
termination) and other non-terminals $Q_{X,s}$, $Q_{\bot,s}$, $Q'_{X,s}$, $Q'_{\bot,s}$ for $X$, $s$
in $Sub(\Delta)$. The complete automaton is given in the full version of the paper.

Proposition 4.3. For any $\Delta$, the relation $\xrightarrow{*}$ between PA terms is a recognizable
relation, and there is an automaton with size $O(|\Delta|)$ recognizing it. The relations
$\to$, $\xrightarrow{+}$ and $\xrightarrow{w}$ for any $w \in Act^*$ are recognizable.

Since the image and the inverse image of a recognizable language via a recognizable
relation are recognizable, the recognizability of $\xrightarrow{*}$ implies the regularity
theorems of [LS9?] as a byproduct. However, having recognizable $Post^*(L)$ and
$Pre^*(L)$ does not necessarily entail the recognizability of $\xrightarrow{*}$. This can be
illustrated in the PA framework, as the following remark shows.

Remark 4.4. An alternative definition of the reachability relation for PA is obtained
by replacing the rule $(R'_S)$ from Section 2 with

$$
(R''_S)\ \frac{t_1 \xrightarrow{w_1,c_1} t_1' \quad t_2 \xrightarrow{w_2,c_2} t_2'}{t_1.t_2 \xrightarrow{w_1.w_2,\, c_1 \oplus c_2} t_2'}\ \text{ if } Const(t_1') = \emptyset
$$

(indeed, why not get rid of these useless terminated processes?). With this new
definition, it is still true that, for regular $L \subseteq T$, $Pre^*(L)$ and $Post^*(L)$ are
regular tree languages, but the relation $\xrightarrow{*}$ is in general not recognizable. □

4.2 Costs for Reachability 

We can easily write simultaneously a cost grammar associated to the top-down
automaton for $\xrightarrow{*}$ which satisfies the following property:



348 D. Lugiez and Ph. Schnoebelen 



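Proposition 4.5. For any $s, t \in T$:

1. $R \mapsto^* s \times t$ iff $s \xrightarrow{*} t$. Furthermore, if $s \xrightarrow{c} t$ then $c \in C^{R}$, and if $c \in C^{R}$
then $s \xrightarrow{c} t$ for some $s, t$ such that $R \mapsto^* s \times t$.

2. $R' \mapsto^* s \times t$ iff $s \xrightarrow{*} t$ and $t$ is terminated. Furthermore, if $s \xrightarrow{c} t$ then
$c \in C^{R'}$, and if $c \in C^{R'}$ then $s \xrightarrow{c} t$ for some $s, t$ such that $R' \mapsto^* s \times t$.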

5 TL, the First-Order Transition Logic 

Assume $\Delta$ is fixed. The first-order transition logic TL is the first-order logic
with process variables $(u, v, \ldots)$, the binary predicates $=$, $\to$ and $\xrightarrow{*}$, and any other
recognizable relation like, for instance, $u \in P$ for $P$ a regular tree language.
Observe that whether $t_1, \ldots, t_n$ satisfies $\varphi(u_1, \ldots, u_n)$ depends on the underlying
PA declaration $\Delta$. Theorem 4.2 yields the decidability of TL, or more precisely:

Corollary 5.1. For any $\Delta$ and any TL formula $\varphi(u_1, \ldots, u_n)$, we can build an
automaton recognizing $Sol(\varphi)$ (a subset of $T^n$).



Expressivity. Since quantifiers can be used freely, and since equality and other
predicates are available, TL is more expressive than the modal logic EF handled
in [?]. For instance, the confluence of $\xrightarrow{*}$ is expressed by the TL formula

$$\forall u, v, v'\ \big( (u \xrightarrow{*} v \wedge u \xrightarrow{*} v') \Rightarrow \exists v''\, (v \xrightarrow{*} v'' \wedge v' \xrightarrow{*} v'') \big)$$

Furthermore, the logic TL could be extended so that several PA declarations
may be used simultaneously. E.g., we can state that the terms reachable from $X$
via $\Delta$ are the terms reachable via $\Delta_1$ followed by $\Delta_2$ by the formula

$$\forall u\ \big( X \xrightarrow{*}_{\Delta} u \Leftrightarrow \exists v\, (X \xrightarrow{*}_{\Delta_1} v \wedge v \xrightarrow{*}_{\Delta_2} u) \big)$$

Process terms with free variables, e.g., $(X.u) \| (u.0)$, can also be used since
the predicates encoding the function symbols are recognizable.



Parameterized verification and model measuring.

Computing $Sol(\varphi)$ is more general than deciding validity, satisfiability, or
model checking (telling whether $t \models \varphi(u)$ for a given $t$ and $\varphi$). In model checking
applications, being able to compute $Sol(\varphi)$ under a suitable symbolic representation
(a tree automaton $A_\varphi$ in our case) gives a general approach that smoothly
integrates and generalizes parameterized verification and model measuring.

Example 5.2. Write $X^n$ for $X \| (X \| (X \cdots \| X) \ldots)$ with $n$ copies of $X$: the sets
$L = \{X^n \mid n = 0, 1, 2, \ldots\}$ and $L' = \{Y^m \mid m = 0, 1, 2, \ldots\}$ are regular tree languages. Let
$\varphi(u, v)$ be some TL formula comparing the behaviors of $u$ and $v$ (e.g.
$\varphi \stackrel{\mathrm{def}}{=} \forall z\, ((v \xrightarrow{*} z \wedge z \xrightarrow{+} z) \Rightarrow u \xrightarrow{*} z)$, “all loops of $v$ are loops of $u$”). By computing
$Sol(u \in L \wedge v \in L' \wedge \varphi(u, v))$ we can find which values of $n$ make $X^n$ relate to
some $Y^m$ (or to all of them) and we can also find which values of $m$ make the
property $\varphi(u, Y^m)$ satisfiable (or valid) for the $X^n$’s. □

6 Reachability under (Decomposable) Constraints

In this section, we enrich the basic reachability predicate used in TL and allow
writing “decomposable” constraints upon the derivation costs.

6.1 The Decomposable Transition Logic DTL

In [LS99] we proved that reachability via a trace constrained by a regular word
language is undecidable but it is decidable for “decomposable” languages (see
also [?]). Here we define decomposable cost predicates along the same
lines for unary cost predicates $P$. We write $P(c)$ when $P$ holds for $c$.

Definition 6.1. A finite set $\mathcal{DP}$ of cost predicates is a decomposable family if

seq-decompositions: for all $P \in \mathcal{DP}$ there is a finite index set $I$ and a family
$\{P_i^1, P_i^2 \in \mathcal{DP} \mid i \in I\}$ s.t. for all $c, c' \in M$, $P(c \oplus c')$ iff $\bigvee_{i \in I} P_i^1(c) \wedge P_i^2(c')$,
par-decompositions: for all $P \in \mathcal{DP}$ there is a finite family $\{P_i^1, P_i^2 \in \mathcal{DP} \mid i \in I\}$
s.t. for all $c, c' \in M$, $P(c \otimes c')$ iff $\bigvee_{i \in I} P_i^1(c) \wedge P_i^2(c')$,
unit-decompositions: for all $P \in \mathcal{DP}$ and all costs $c$ appearing in $\Delta$, there is a
finite family $\{P_i \in \mathcal{DP} \mid i \in I\}$ s.t. for all $c' \in M$, $P(c \oplus c')$ iff $\bigvee_{i \in I} P_i(c')$.

A predicate $P$ is decomposable if it belongs to a decomposable family. DTL
(Decomposable Transition Logic) is the first-order logic that extends TL by
allowing all atoms $u \xrightarrow{\exists c\, P(c)} v$ where $P$ is any decomposable predicate.

$u \xrightarrow{\exists c\, P(c)} v$ is short for “$\exists c,\ u \xrightarrow{c} v \wedge P(c)$” and holds iff there is a derivation
$u \xrightarrow{c} v$ such that $P(c)$ holds. Observe that we require that any cost variable be
immediately quantified upon (like the freeze quantification of [?]), hence the
cost variable $c$ is always bound and we sometimes simply write $u \xrightarrow{\exists P} v$.

Using the tree automata approach, we can show that the relation $\xrightarrow{\exists P}$ is
recognizable for any decomposable predicate $P$, which yields the decidability theorem:

Theorem 6.2. The logic DTL is decidable.

We now look at instances of DTL where costs measure some form of timing. 

6.2 The Timed Transition Logic TTL 

TTL (Timed Transition Logic) is the first-order logic that extends TL by allowing
all atoms $u \xrightarrow{\exists c\, \tau} v$ where $\tau$ is a time constraint built according to the grammar:

$$\tau ::= c < C \mid \neg\tau \mid \tau \wedge \tau$$

where the $C$’s can be any numerical constant from a time domain $\mathbb{T}$ that can be
$\mathbb{N}$, or $\mathbb{Q}_+$, or $\mathbb{R}_+$ (and where $c$ is the free cost variable of $\tau$). Since these time
constraints can be expressed by decomposable predicates, we have

Proposition 6.3. For any time constraint $\tau$, the relation $s \xrightarrow{\exists c\, \tau} t$ is recognizable.

allowing the following instantiation of Theorem 6.2:

Theorem 6.4. The logic TTL is decidable.
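To see why a bound on a time cost can be decomposed, consider the following sketch. It rests on assumptions made only for this illustration: the time domain is the natural numbers, sequential composition of costs is addition, parallel composition is the maximum, and the predicate is the non-strict bound "cost at most `bound`". Under these assumptions a bound on a sum splits into finitely many pairs of bounds on the summands, and a bound on a maximum splits into a single pair.

```python
def seq_family(bound):
    """Pairs (i, bound - i): c + c' <= bound iff c <= i and c' <= bound - i for some i."""
    return [(i, bound - i) for i in range(bound + 1)]

def sum_within_bound(c, c_prime, bound):
    """Decide c + c' <= bound using only bounds on c and on c' separately."""
    return any(c <= left and c_prime <= right for left, right in seq_family(bound))

def max_within_bound(c, c_prime, bound):
    """Decide max(c, c') <= bound using one pair of bounds."""
    return c <= bound and c_prime <= bound

# Brute-force comparison with the direct definitions on a small range.
checked = all(
    sum_within_bound(c, d, b) == (c + d <= b)
    and max_within_bound(c, d, b) == (max(c, d) <= b)
    for b in range(6) for c in range(10) for d in range(10)
)
```

The family for a given bound is finite, which is the property the definition of a decomposable family asks for; dense time domains would need a different, still finite, choice of pairs.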

TTL can be enriched with the predicate $u \xrightarrow{Unbounded} v$ meaning that going
from $u$ to $v$ may take arbitrarily long time:

Lemma 6.5. The relation $\xrightarrow{Unbounded}$ is recognizable.



7 The Transition Logic with Constraints TLC 

In this section, we add a first-order logic of costs to the transition logic. The
resulting two-sorted logic, called TLC, is very expressive and allows parameterized
verification with parameters ranging over costs rather than “processes”. For
TLC, costs are Parikh costs and the cost $c = (n_1, \ldots, n_p)$ of a derivation $s \xrightarrow{c} t$
records the number of occurrences of each action of $Act$ along the derivation.
Since $c$ is now a $p$-tuple of integers, we often write $x_1, \ldots, x_p$ or $\vec{x}$ instead of $c$.
The formulas over costs are Presburger formulas. This allows stating properties
such as “$s \xrightarrow{*} t$ with as many actions $a$ as actions $b$”.
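A Parikh cost is simply a vector of action counts. The following sketch assumes traces are strings with one character per action and fixes a hypothetical ordering `ACTIONS` of the alphabet; it computes the Parikh image of a trace and checks the "as many a as b" property from the text.

```python
from collections import Counter

def parikh(trace, actions):
    """Return the tuple of occurrence counts of each action in the trace."""
    counts = Counter(trace)
    return tuple(counts[a] for a in actions)

def add(c1, c2):
    """Componentwise sum: the cost of two derivations composed together."""
    return tuple(x + y for x, y in zip(c1, c2))

ACTIONS = ("a", "b")
cost = parikh("abba", ACTIONS)
balanced = cost[0] == cost[1]   # as many actions a as actions b
```

The Parikh image forgets the order of actions, so every interleaving of two traces has the same cost, namely the componentwise sum of their Parikh images.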

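TLC allows two kinds of atoms: all $R(u_1, \ldots, u_n)$ for $R$ a recognizable
relation as in TL and all $u \xrightarrow{\exists\vec{x}.\psi(\vec{x},\vec{y})} v$ where $\psi(\vec{x},\vec{y})$ is a Presburger formula
whose free variables are partitioned into $\vec{x}$, a tuple of $p$ integer variables for
the cost of the derivation, and the arbitrary parameters $\vec{y}$. $u \xrightarrow{\exists\vec{x}.\psi(\vec{x},\vec{y})} v$ is short
for “$\exists c,\ u \xrightarrow{c} v \wedge \psi(c, \vec{y})$”. Observe that only $u$, $v$, $\vec{y}$ are free in $u \xrightarrow{\exists\vec{x}.\psi(\vec{x},\vec{y})} v$. The
negation of $u \xrightarrow{\exists\vec{x}.\psi(\vec{x},\vec{y})} v$ can be written $\forall\vec{x}\,(u \xrightarrow{\vec{x}} v \Rightarrow \psi'(\vec{x},\vec{y}))$ where $\psi'$ is $\neg\psi$,
another Presburger formula, and we shall use this notation freely. TLC formulas
are given by the abstract syntax:

$\varphi ::= \text{Atom} \mid \varphi \wedge \varphi \mid \neg\varphi \mid \exists u\,\varphi \mid \exists y\,\varphi$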

Since the full TLC is undecidable (Prop. 7.1), we introduce two fragments that
will be shown decidable (Theorem 7.3):

the parameterized existential fragment, which is the set of closed formulas
that can be written under the form $(\exists|\forall\, y)^*(\exists\, u)^*[\bigvee\bigwedge \text{Atoms}]$, and
the parameterized universal fragment, which is the set of closed formulas
that can be written under the form $(\exists|\forall\, y)^*(\forall\, u)^*[\bigvee\bigwedge \neg\text{Atoms}]$.

The restriction on the polarity of atoms applies only to reachability
atoms $u \xrightarrow{\exists\vec{x}.\psi(\vec{x},\vec{y})} v$ with some non-empty $\vec{y}$, hence $u \xrightarrow{*} v$, etc., can be negated
freely. Observe that one fragment only contains formulas that are (equivalent
to) the negation of a formula of the other fragment.

7.1 The Difference between TLC and Temporal Logics 

Temporal logics do not have a mechanism for identifying states precisely and
relating them. On the other hand, they can refer to a given path, state properties
that hold along this path, and relate paths. When we write $u \xrightarrow{\exists\vec{x}.\psi(\vec{x})} v$ in TLC,
we state that u may go to v via a path having some cost properties, but we
cannot isolate this path and refer to it again. Additionally, temporal modalities
are recursive by nature, while transition logics only have the fixpoint $\xrightarrow{*}$
built in. As a consequence, simple temporal modalities like EF can be expressed in
the transition logic but more complex constructions like E U cannot.

The ability to refer to states is the specific feature of transition logics: these 
logics can distinguish bisimilar processes. Moreover, the rich language for constraints
allows expressing many counting properties that occur naturally in verification.



7.2 Expressing Properties with TLC 

We consider the communication protocol example used in [BEH95], where they
focus on the two actions req (for requests) and ack (for acknowledgments).

TLC can state that, between an initial state and a final state, a protocol
sends as many acknowledgments ack as requests req:

$\forall u, v\ (u \in Init \wedge v \in Final) \Rightarrow \big(\forall x_{ack}, x_{req}\ ((u \xrightarrow{x_{ack},x_{req}} v) \Rightarrow x_{ack} = x_{req})\big)$

where “$x_{ack} = x_{req}$” is the $\psi(\vec{x})$ constraint. (We assumed that Init and Final
are regular tree languages.)

TLC can also state that at any position along a path from an initial state to
a final state, the number of sent requests is always greater than or equal to the
number of received acknowledgments. This is specified as follows:

$\forall u, v, w\ \forall x_{ack}, x_{req}\ \Big( \big(u \in Init \wedge v \in Final \wedge u \xrightarrow{x_{ack},x_{req}} w \wedge w \xrightarrow{*} v\big) \Rightarrow x_{req} \geq x_{ack} \Big)$



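TLC can further impose that all traces belong to req* ack*. We add to the
previous formula the formula stating that if a state is reached by emitting an
acknowledgment, then all paths from this state to a final state contain no request.

$\forall u, v, w\ \Big( \big(u \in Init \wedge v \in Final \wedge (u \xrightarrow{*}\xrightarrow{ack} w)\big) \Rightarrow \forall x_{ack}, x_{req}\ (w \xrightarrow{x_{ack},x_{req}} v \Rightarrow x_{req} = 0) \Big)$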

Observe that these three examples are all in the parameterized universal frag- 
ment of TLC and could be enriched by using parameters. 
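The three properties of this subsection can be read off concrete traces. The sketch below checks them on a single finite trace over the actions req and ack; it only illustrates what the formulas require of each path from an initial to a final state and is not the automata-based decision procedure of the paper. The function names and the sample traces are assumptions made for this example.

```python
def balanced(trace):
    """First property: as many ack as req over the whole trace."""
    return trace.count("ack") == trace.count("req")

def acks_never_exceed_reqs(trace):
    """Second property: at every position, #req >= #ack."""
    reqs = acks = 0
    for action in trace:
        if action == "req":
            reqs += 1
        elif action == "ack":
            acks += 1
        if acks > reqs:
            return False
    return True

def requests_before_acks(trace):
    """Third property: once an ack has been emitted, no req follows (req* ack*)."""
    seen_ack = False
    for action in trace:
        if action == "ack":
            seen_ack = True
        elif action == "req" and seen_ack:
            return False
    return True

trace = ["req", "req", "ack", "ack"]
print(balanced(trace), acks_never_exceed_reqs(trace), requests_before_acks(trace))
```

A trace such as req, ack, req, ack satisfies the first two checks but not the third, which shows that the third formula is a genuine strengthening.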




7.3 Decidability Issues for TLC 

The first result is a negative one. 

Proposition 7.1. TLC is undecidable, even when restricted to the fragment
without parameters and with only $\exists\forall$ quantification for process variables.

This result prompted the introduction of the parameterized fragments of TLC.
Let $\Phi = (\exists|\forall\, y)^*(\exists\, u)^*\varphi$ be a parameterized existential TLC formula. Let
$\phi(y_1, \ldots, y_k)$ denote the “$(\exists\, u)^*\varphi$” part and $Sol(\phi) = \{(n_1, \ldots, n_k) \mid\ \models \phi(n_1, \ldots, n_k)\}$.

Theorem 7.2. $Sol(\phi)$ is an effectively computable semilinear set.

The proof of this result uses an extension of tree automata which combines a 
cost grammar with the usual tree automata rules. This class of automata is closed 
under product and union and the reachability of a state is decidable. Moreover 
in our case, the set of costs associated to a state is an effectively computable 
semilinear set and the relation $\rightarrow$ is accepted by an automaton with cost.

Theorem 7.3. The parameterized existential and the parameterized universal 
fragments of TLC are decidable.



8 Conclusion 

The recognizability of $\xrightarrow{*}$ extends our earlier results on reachability sets. This
also opens new directions for automata-theoretic approaches to the verification of
PA-processes, since being able to compute the set of solutions of a transition logic 
formula allows a smooth and general approach to the verification of parameter- 
ized properties for parameterized systems. Additionally, the automata-theoretic 
approach relies on quite simple constructions. The consequence is that we can 
easily extend it in various ways, as we demonstrated with reachability under de- 
composable cost predicates, with various timed extensions of the transition logic, 
and with TLC where both PA-processes and Parikh costs can be constrained 
via parameterized formulas. An important goal for future work is to analyze the 
computational complexity of the various ideas we proposed. This should help 
understand what cost sets and what decomposable predicates can be handled in 
practice, and what restrictions may be fruitfully imposed on transition logics so 
that they remain computationally tractable. 
Abstract. Many security properties of cryptographic protocols can
all be seen as specific instances of a general property, which we called Non
Deducibility on Composition (NDC) and which we proposed a few years ago
for studying information flow properties in computer systems. The ad- 
vantage of our unifying theory is that formal comparison among these 
properties is now easier and that the full generality of NDC has helped 
us in finding a few new attacks on cryptographic protocols. 



1 Introduction 

Many security properties of cryptographic protocols have been identified in re- 
cent years, such as secrecy (confidential information should be available only to 
the partners of the communication), authentication (capability of identifying the 
other partner engaged in a communication), integrity (assurance of no alteration 
of message content), non repudiation (assurance that a signed document can- 
not be repudiated by the signer), fairness (in a contract, no party can obtain 
advantage by ending the protocol first), and some others. 

Even if there is a widespread agreement on what is the intended meaning of 
these properties, under a closer scrutiny one realizes that they are very slippery 
properties, especially authentication. As a matter of fact, formal definitions, e.g.
of authentication, have rarely been given, are not widely agreed upon, are usually
not compared, and have only recently been proposed in the literature (see, e.g., mmm).
This is sometimes due to the fact that we first need a formal model on which the 
problem is defined (and this is often a source of possible proliferation of different 
proposals) and then a formal definition w.r.t. the chosen model. Moreover, even 
when a formal definition is given, usually this is not (easily) comparable to 
others, due to different mathematical assumptions of the model. 

* Work partially supported by MURST Progetto TOSCA and Progetto “Certificazione
automatica di programmi mediante interpretazione astratta”; CNR Progetto
“Modelli e Metodi per la Matematica e l’Ingegneria”; CSP Progetto “ISA: Isp Secured
trAnsactions”.
Fig. 1. Multilevel security: a high subject S_1 cannot write a low object O_2 and a low
subject S_2 cannot read a high object O_1.



Our claim is that a classic approach to security, used to study information
flow in multilevel computer systems, can also be profitably used for the
analysis of security properties in network protocols.



1.1 Multilevel Security and non Interference 

In a multilevel system, processes/users and objects are bound to a specific
security level (e.g., in the military jargon, unclassified, classified, secret and top
secret) and information can only flow from low levels to higher ones. This is
usually implemented by constraining the possible actions of processes according
to the rules of no read-up and no write-down (see Fig. 1).

The advantage of this approach w.r.t. conventional approaches used in
commercially available operating systems (e.g., Unix) is that the possible information
disclosure caused by the inadvertent execution of a Trojan Horse program is
confined inside the level of the user that executed it. However, these two rules
are not enough, as indirect information flows, usually called covert channels, may
be possible when using some shared resource. For instance, it is not difficult to
build a Trojan Horse program that, once executed by a high level user, is able to
downgrade information by synchronizing with a low level process on the system
side-effects generated by repeatedly filling the shared hard disk: at a predefined
initial time, the high process can transmit a bit 0 by causing a disk-full error on
the low level attempt to write, or a bit 1 by allowing the low process to write,
hence one bit every two clock cycles.
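The disk-full covert channel can be made concrete with a toy simulation. In the sketch below the class `SharedDisk`, its capacity and the two-step cycle are assumptions chosen for illustration; no real operating system interface is modelled. The high process never writes to a low object, yet the low process recovers every bit from the success or failure of its own writes.

```python
class SharedDisk:
    """Toy shared resource with a fixed number of blocks (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def fill(self):
        self.used = self.capacity

    def clear(self):
        self.used = 0

    def try_write(self):
        """Low-level write of one block; returns False on a disk-full error."""
        if self.used >= self.capacity:
            return False
        self.used += 1
        return True

def transmit(bits, disk):
    """Send `bits` from the high process to the low process, one bit per two cycles."""
    received = []
    for bit in bits:
        # Cycle 1: the high process encodes the bit in the state of the disk.
        if bit == 0:
            disk.fill()
        else:
            disk.clear()
        # Cycle 2: the low process tries to write and observes the outcome.
        received.append(1 if disk.try_write() else 0)
    return received

print(transmit([1, 0, 0, 1, 1], SharedDisk(capacity=4)))
```

Both no read-up and no write-down are respected here, which is the point of the example: the flow goes through a side effect rather than through a read or a write of a protected object.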

To solve the problem of preventing unauthorized information flows, be they 
direct or indirect, in the last two decades many proposals have been presented, 
starting from the seminal idea of non interference proposed in uni for determin- 
istic state machines. In recent work mmn, two of the authors have studied the 
many non interference-like definitions in the literature, by defining all of them 
uniformly in a common process algebraic setting, producing the first taxonomy 
of these security properties reported in the literature. 
In |l()lll| . we use a CCS-like process algebra called Security Process
Algebra (SPA for short), where the set of actions is partitioned into two sets 
L and H of low actions and high ones, respectively. Processes built by using 
only actions in H (L) are by construction high (low) level processes. These pro- 
cesses are secure because their activities (expressed by the actions they perform) 
are confined inside the high (low) level. More interesting is the case of mixed 
(with actions from both L and H) processes, i.e., of those processes that even if 
belonging to the high (low) level may perform interactions with low (high) ob- 
jects, because for them we want to know if they allow information to flow in the 
wrong direction. Among the many non interference-like properties, we advocate 
one special property, called Non Deducibility on Composition (NDC for short), 
that can be expressed as follows: 

$E \in NDC$ iff $\forall \Pi \in \mathcal{E}_H : (E \parallel \Pi) \setminus H \approx E \setminus H$

where $\mathcal{E}_H$ is the set of all high level processes, $\approx$ is a behavioural equivalence
relation^, $\parallel$ is the CCS parallel composition and $\setminus$ is the CCS restriction operator.
Hence, on the one hand, $E \setminus H$ is able to exhibit only the low level behaviour of
$E$, while $(E \parallel \Pi) \setminus H$ is the low level behaviour of $E \parallel \Pi$. The basic intuition is
that the requirement of 

No information flow from high to low 
is expressed by 

No high level process can change the low behaviour. 
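To make the definition tangible, here is a minimal sketch that tests the NDC equation on tiny labelled transition systems, taking trace equivalence as the behavioural equivalence. It is not the SPA semantics of the paper: processes are plain dictionaries, co-actions are written with a leading `~`, the universal quantification is replaced by a finite list of candidate high processes supplied by the caller, and the trace sets are exact only when `depth` exceeds the longest run. All names are assumptions made for this example.

```python
TAU = "tau"

def co(action):
    """Complementary action: h <-> ~h."""
    return action[1:] if action.startswith("~") else "~" + action

def channel(action):
    return action.lstrip("~")

def restricted_parallel(e, pi, high):
    """Step function of (E || Pi) \\ H over pairs of states (s, t)."""
    def step(state):
        s, t = state
        moves = []
        for a, s2 in e.get(s, []):            # E moves alone on a non-high action
            if channel(a) not in high:
                moves.append((a, (s2, t)))
        for b, t2 in pi.get(t, []):           # Pi moves alone on a non-high action
            if channel(b) not in high:
                moves.append((b, (s, t2)))
        for a, s2 in e.get(s, []):            # synchronisation yields tau
            for b, t2 in pi.get(t, []):
                if a != TAU and b == co(a):
                    moves.append((TAU, (s2, t2)))
        return moves
    return step

def traces(step, start, depth):
    """Visible traces (tau hidden) of all runs with at most `depth` steps."""
    found, stack = set(), [(start, (), depth)]
    while stack:
        state, trace, fuel = stack.pop()
        found.add(trace)
        if fuel > 0:
            for action, nxt in step(state):
                visible = trace if action == TAU else trace + (action,)
                stack.append((nxt, visible, fuel - 1))
    return found

def ndc_against(e, e0, high, candidates, depth=6):
    """Check (E || Pi) \\ H = E \\ H (trace equivalence) for each candidate Pi."""
    low_view = traces(restricted_parallel(e, {}, high), (e0, None), depth)
    return all(
        traces(restricted_parallel(e, pi, high), (e0, p0), depth) == low_view
        for pi, p0 in candidates
    )

HIGH = {"h"}
SENDER = ({"p0": [("~h", "p1")]}, "p0")               # high process offering ~h
LEAKY = {"e0": [("h", "e1")], "e1": [("l", "e2")]}    # l is possible only after h
SAFE = {"e0": [("h", "e1"), ("l", "e2")], "e1": [("l", "e2")]}

print(ndc_against(LEAKY, "e0", HIGH, [SENDER]), ndc_against(SAFE, "e0", HIGH, [SENDER]))
```

In the leaky process the low action l becomes observable only when a high process supplies h, so the high level changes the low behaviour; in the safe variant l is available either way.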



1.2 Non Interference for Security Protocols 

NDC essentially says that, given two groups of users H and L, there is no 
information flow from H to L iff there is no way for H to modify the behaviour 
of L. Analogously, we may think that L is the set of the honest participants to 
a protocol and H is the external, possibly malicious, environment, i.e. the set of 
possible intruders (or enemies) . Following the analogy, no information flow from 
high to low means that the intruders have no way to change the low behaviour 
of the protocol. 

To set up this correspondence more precisely, we have to single out the high 
level actions and the low level ones in this setting. We should assume that an 
intruder may have complete control of the network, and so it is reasonable to 
assume that the public channels (i.e., the names used for message exchange) are 
the high level actions. On the other hand, as a protocol specification is usually 
completely given by message exchanges, it is not clear what are the low level 
actions. In our approach, the low level actions are extra observable actions that 
are included into the protocol specification to observe properties of the protocol. 
Of course, the choice of these extra actions is property dependent. For instance, 

^ Actually, NDC is this property when $\approx$ is trace equivalence; other similar
properties have been proposed by changing the relation, e.g., BNDC is as above
where $\approx$ is weak bisimulation.
we will see that to model some form of authentication as in m, it is enough to
include special start/commit actions for all the honest participants.

Furthermore, enemies should not be allowed to know secret information in
advance: as we assume perfect cryptography (encrypted information can be known
by an enemy only if it knows the decryption key), the initial knowledge of
an enemy must be limited to include only publicly available information, such
as names of entities and public keys, and its own private data (e.g., the enemy’s
private key). Hence, by following IfiMEIlSl, the set $\mathcal{E}_C^{\phi_I}$ of all the possible high
level processes is as follows:

$\mathcal{E}_C^{\phi_I} = \{X \mid sort(X) \subseteq C \text{ and } ID(X) \subseteq \mathcal{D}(\phi_I)\}$

where $C$ is the set of public channel names, $ID(X)$ is the set of messages that
syntactically appear in $X$, $\phi_I$ is the initial knowledge given to any enemy $X$, and
$\mathcal{D}$ is a deduction system that manipulates (blocks of) messages in the obvious
way (e.g., encrypted information can be disclosed if the decryption key is known).
By requiring that all the messages in $ID(X)$ are deducible from $\phi_I$ we are stating
that the enemy cannot know in advance messages that are not explicitly given.
The NDC property for a protocol $P$ can hence be reformulated as:

$P \in NDC$ iff $\forall X \in \mathcal{E}_C^{\phi_I} : (P \parallel X) \setminus C \approx P \setminus C$

On the one hand, P\C represents the secure specification of the protocol P 
running in isolation on perfectly secure channels. The visible behaviour of P is 
given by the property dependent, extra observables included in the specification. 
Hence, the behaviour of P \ C should describe the security property of interest. 
On the other hand, if P \ C is equivalent to (P || X) \ C, then this clearly means 
that X is not able to modify in any way the observable execution of P, i.e., the 
security property holds.

The actual scheme we use, called GNDC, is a bit more general: the behavioural
relation may be any pre-order and P \ C is replaced by a function of P, α(P), expressing
the property as a set of processes having a special format that more directly
and intuitively recalls the property of interest. Nonetheless, it is possible to show
that NDC is the strongest property we can reasonably define over cryptographic
protocols (for more details, see jIS|).

Interestingly enough, when the observational equivalence $\approx$ is trace
equivalence (two systems are equivalent if they perform the same set of traces), then
NDC can be characterized in a simpler way, by finding a canonical, most
general enemy that can be used in place of all (see [IS| ). By removing the universal
quantification, NDC can be verified by one single, albeit huge, check. The most
general intruder is an intruder that can eavesdrop/intercept any message (adding
the intercepted information to its knowledge set), as well as produce new
messages with pieces of information it knows.



1.3 Plan of the Paper 

What do we want to show with this paper? Our primary goal is to substantiate
our claim that most (maybe all) security properties proposed for the analysis
of cryptographic protocols are expressible as suitable instances of the GNDC
schema above, by suitably choosing the property dependent, extra observable
actions as well as a suitable behavioural equivalence. Some work in this direction
has been reported in mm. We think that the advantages of our approach
include at least the following: 

— one check for all: As all the properties are defined in the same NDC style, it 
is possible to put in the specification the extra actions for all the properties 
of interest, hence obtaining that one check for this rich case implies that all 
the properties are satisfied. 

— formal comparison: as the definitions are now given in a uniform style, it 
should be easier to compare the relative merits; this is especially true for 
slippery properties such as the many varieties of authentication (e.g., see 

for some preliminary results in this direction). 

— accuracy: So far we have analyzed about 40 protocols (with the help of an 
automatic tool |HE]) of a well-known library of crypto-protocols jOj. Two
supposedly correct protocols have been shown incorrect and for a few addi- 
tional flawed protocols some new attacks have been found. Our experience 
hence supports our claim that a protocol passing the NDC test is more 
likely to be flaw free. 

The paper is organized as follows. In Section 2 we gently introduce the reader 
to the realm of security properties; by means of some simple examples, five secu- 
rity properties are informally described. Section 3 shows that all these properties 
are actually instances of the general scheme GNDC and that NDC (when all 
the suitable extra actions have been inserted) implies them all. Section 4 shows 
one larger example, the Woo & Lam mutual authentication protocol as reported 
in Schneier’s textbook. This version of the protocol is flawed. Finally, Section
5 reports some final remarks and future work.



2 Security Properties 

In this section we present some typical security properties through some simple 
examples. All these properties have been defined for different aims and have been 
formalized using various models. Indeed, the examples will allow us to identify 
a general common idea behind all of these properties. 



2.1 A Simple Key-Exchange Protocol 

The first example we consider is a simple key-exchange protocol (see ^H) with
public key cryptography. There is a process KDC (Key Distribution Center)
on a remote host which is devoted to the distribution of the public keys. In
particular, when Alice (A) needs to send some secret information to Bob (B), A
asks KDC for B’s public key. Then, A can encode every secret message with this
key and is guaranteed that only B will be able to read it. In particular, in the
protocol we are going to analyze, A sends to B a session key encoded with the
public key of B. This session key will be used for every further communication,
until the two users decide to establish a new session with a new session key.

The aim of the protocol is simply to distribute to B the session key generated
by A. Since A uses the public key of B, she is assured that only B will know
the session key and so there is a form of implicit one-side authentication of
B for A. The opposite authentication is not valid since B has no guarantees
about who he is talking to. Anyone can encrypt a session key with B’s public
key and send it pretending to be A. So, B will presumably check the identity
of A during the session (if this is required). We could imagine a situation of a
remote login on a machine B by a user A. First the user establishes a session key
with the remote login server using the protocol above. Then the communication
proceeds encrypted with the session key and the server checks the identity of A
using, for instance, some login/password mechanism. However, A is assured that
she is communicating with B since only B can have received the session key.

In order to formalize the protocol we use the notation A → B : msg
representing the sending of message msg from A to B. A possible definition of the
protocol could be the following sequence of four message exchanges:



Message 1   A → KDC : A, B
Message 2   KDC → A : PK_B
Message 3   A → B : {K_sess}_{PK_B}
Message 4   B → A : {M}_{K_sess}



where Message 1 is the request, sent from A to KDC, of B’s public key; Message
2 is the reply from KDC to A, containing the public key PK_B of B; Message 3
contains the session key K_sess encoded using the public key PK_B and is sent to
B; finally, in Message 4, B uses the session key to send a message M to A.

Different security properties can be now considered: (i) message authenticity,
e.g., message M should be authentic from B (or as it would have been sent from
B) since only A and B should know the session key at the end of the protocol;
(ii) entity authentication, e.g., if A receives the last message encrypted with the
correct key, then A should be guaranteed that B has run the protocol with her
(or at least is “alive”); (iii) secrecy, e.g., at the end of the protocol, the session
key and the message M should be known only to A and B. In the following, we
consider all of these properties, using the protocol above as a running example.



2.2 Message Authenticity 



Suppose that we can fix the message M that B is willing to send to A. This 
means that M does not represent a generic message, but it is a particular one. 
If we can do this, we can check message authenticity by just considering all 
the possible runs of the protocol and by requiring that, in such runs, A always 
receives the correct message M, i.e. the message B wanted to send to A. If this is 
true, we can conclude that the protocol is indeed guaranteeing that no one is able 
to force A to accept a faked message M'. This notion of message authenticity is
due to Abadi and Gordon.

A first important thing to observe is that considering all the possible runs is 
not enough. At least we need to be more precise about what we mean by run. As 
a matter of fact, we have to consider all the possible executions of the protocol 
in every possible (potentially hostile) environment. It is certainly different if we 
consider the protocol execution with or without the presence of some malicious 
enemy which tries to send a faked M'. (As mentioned in the Introduction, we
have to consider an initial knowledge $\phi_I$ but, for the sake of readability, we often
omit it.) We can thus rephrase the message authenticity property as follows:

“Whatever hostile environment is considered, A will never receive (as 
part of Message 4) during all her possible runs a message different from 
M”. 



A second important issue is now also evident. We are requiring that a particular
piece of information sent inside a particular message does not differ from a certain
fixed message M. Indeed, this property looks really ad hoc for the key-exchange
protocol considered here. This is quite typical when trying to define precisely
security properties, since they often depend on the structure of the analyzed
protocol/system. We can make the specification more intuitive by using an event
received(m) corresponding to the fact that A is receiving message m. If P(m) is
the protocol where Bob is willing to send message m to Alice we can just state
that



“P(M) guarantees message authenticity if whatever hostile environment
is considered, an event received(M') with M' ≠ M can never occur”.

We will further generalize this idea in the following. Before that we show that the 
protocol we have considered until now does not guarantee message authenticity. 
The weakness is indeed in the second message. An enemy can easily intercept 
the public key of B and substitute it with its own key. This allows the enemy to 
learn the session key and to send a faked message M' as follows:



Message 1    A → KDC : A, B
Message 2    KDC → E(A) : PK_B                E intercepts this
Message 2'   E(KDC) → A : PK_E
Message 3    A → E(B) : {K_sess}_{PK_E}
Message 4    E(B) → A : {M'}_{K_sess}         event received(M')



^ Indeed, in |2) a universal quantification over all the possible messages to be sent is 
required. For the sake of simplicity we do not consider it here. 

® We denote with E(U) the enemy which is impersonating the entity U. So E(U) →
A : M means that E sends M to A by simulating U, while A → E(U) : M means that
E intercepts the message M from A to U (so U receives nothing). However, we
must point out that this message sequence notation should be used only for the
intuitive description of the protocols (and attacks) and not for their formal analysis,
as remarked by many authors (e.g., see Q).
Thus, message authenticity does not hold since, in this particular execution, an
event received(M') occurs and M' can be whatever message (different from M)
the enemy is willing to send to A.
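The attack can be replayed symbolically. The sketch below models encryption as a tagged tuple and runs the four messages once honestly and once with the enemy substituting its own public key in Message 2. It assumes perfect cryptography as in the text; the helper names (`enc`, `dec`, `inverse`, `simulate`) and the string encoding of keys are assumptions made for this illustration and do not reproduce the process-algebraic specification used for the formal analysis.

```python
def inverse(key):
    """Key needed to decrypt what `key` encrypted (symmetric keys are self-inverse)."""
    if key.startswith("PK_"):
        return "SK_" + key[3:]
    if key.startswith("SK_"):
        return "PK_" + key[3:]
    return key

def enc(body, key):
    return ("enc", body, key)

def dec(cipher, key):
    _, body, used = cipher
    if inverse(used) != key:
        raise ValueError("wrong decryption key")
    return body

def simulate(enemy_active, m="M", fake="M'"):
    """Run Messages 1-4 and return the events observed at A."""
    k_sess = "K_sess"
    # Message 2, or Message 2' when the enemy substitutes its own public key.
    pk_seen_by_a = "PK_E" if enemy_active else "PK_B"
    msg3 = enc(k_sess, pk_seen_by_a)                 # Message 3
    if enemy_active:
        stolen = dec(msg3, "SK_E")                   # the enemy learns K_sess
        msg4 = enc(fake, stolen)                     # faked Message 4
    else:
        msg4 = enc(m, dec(msg3, "SK_B"))             # honest Message 4
    return [("received", dec(msg4, k_sess))]

def message_authenticity(events, m="M"):
    """True iff no event received(M') with M' different from M occurs."""
    return all(payload == m for tag, payload in events if tag == "received")

print(message_authenticity(simulate(enemy_active=False)))
print(message_authenticity(simulate(enemy_active=True)))
```

A single run with one specific enemy only exhibits a violation; establishing the property requires considering every hostile environment, which is what the quantification in the definition expresses.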



2.3 Entity Authentication 

We now consider another security property: entity authentication. This property
is more subtle. Informally, entity authentication should allow the verification
of an entity’s claimed identity by another entity. There are several attempts
in the literature to formalize this notion. Here, we follow the ones based on
correspondence between actions of the participants (e.g., see usm).

As an example, in our protocol we would like that whenever A receives the
last message then B has indeed executed the protocol. Consider two events
commit(A, B) and run(B, A) representing the fact that A has successfully
terminated the protocol apparently with B and B has at least started the protocol
(i.e., he has received a session key) with A. It is now sufficient to require that
event commit(A, B) is always preceded by event run(B, A) m. In other words
commit(A, B) should not happen if B has not started the protocol. Similarly to
the previous property we can require that:

“P guarantees entity authentication of B with respect to A if whatever
hostile environment is considered, an event commit(A, B) can never occur
when run(B, A) has not occurred previously”.

Note that the same attack considered for message authenticity is also an attack 
for entity authentication: 



Message 1    A → KDC : A, B
Message 2    KDC → E(A) : PK_B                E intercepts this
Message 2'   E(KDC) → A : PK_E
Message 3    A → E(B) : {K_sess}_{PK_E}
Message 4    E(B) → A : {M'}_{K_sess}         event commit(A, B)



Since B is doing nothing, no event run(B, A) can happen and the entity
authentication property does not hold. As a matter of fact, in the attack sequence the
enemy is indeed able to masquerade as B. This means that A cannot be sure about
the identity of the other party, i.e., no entity authentication is guaranteed.
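The correspondence requirement is easy to state as a check on a sequence of events. The following sketch scans one event trace and reports whether every commit(A, B) is preceded by a run(B, A); the tuple encoding of events and the function name are assumptions made for this example, and the check is the simple (non-injective) form of correspondence described above.

```python
def entity_authentication_holds(events, a="A", b="B"):
    """True iff every commit(a, b) in `events` is preceded by some run(b, a)."""
    run_seen = False
    for event in events:
        if event == ("run", b, a):
            run_seen = True
        elif event == ("commit", a, b) and not run_seen:
            return False
    return True

honest_run = [("run", "B", "A"), ("commit", "A", "B")]
attack_run = [("commit", "A", "B")]          # B never took part in the session

print(entity_authentication_holds(honest_run))
print(entity_authentication_holds(attack_run))
```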

Note that entity authentication and message authenticity are indeed different
properties. As an example, if we do not have a message “to be sent” by the entity
that we want to authenticate, the message authenticity property becomes useless.
Consider the following (faulty) authentication protocol:

Message 1   A → B : {N_A}_{K_AB}
Message 2   B → A : N_A

In order to verify the identity of B, A sends a challenge N_A (typically a random
number) to B encrypted with symmetric key K_AB which is only known by A
and B. Only B will be able to decrypt it and send it back to A. Note that here
B has no private messages to send to A. As a consequence, the following entity
authentication attack does not represent a message authenticity attack:



Message 1    A → E(B) : {N_A}_{K_AB}
Message 1'   E(B) → A : {N_A}_{K_AB}
Message 2'   A → E(B) : N_A
Message 2    E(B) → A : N_A                   events commit(A, B), received(N_A)



This is a typical parallel session attack: the enemy intercepts the first message
and starts a new session of the protocol with A. Basically, the enemy uses this
second session to obtain from A the value N_A. Finally the enemy can
conclude the first session successfully masquerading as B. Note that, again, we have a
commit(A, B) event with no run(B, A). Note also that we have tried to detect a
possible message authenticity attack by observing the received(N_A) event. However,
N_A is exactly the expected message and, moreover, no other message would
be accepted by A. In other words, we have no message here to authenticate and
entity authentication should be guaranteed (indeed it is not) by the possibility
for B of decrypting a message.



2.4 Secrecy 

Let us now consider the third property: secrecy. This is quite intuitive and re- 
quires that messages declared to be secret should not be learnt by unauthorized 
users. We can consider a new event learnt{M) that represents the fact that a 
certain (secret) message M has been learnt by the external environment (i.e., 
by the enemy). So, this new (low) event is performed by the enemy and not by 
the honest participants, as for the previously analysed properties. In our first 
example of attack we had such an event for K_sess:



Message 1    A → KDC : A, B
Message 2    KDC → E(A) : PK_B                E intercepts this
Message 2'   E(KDC) → A : PK_E
Message 3    A → E(B) : {K_sess}_{PK_E}       event learnt(K_sess)
Message 4    E(B) → A : {M'}_{K_sess}



Indeed, secret key K_sess is learnt by the enemy when it is sent encrypted with
the enemy’s public key. We can thus formulate secrecy in our usual style as follows:

“P guarantees secrecy of m if whatever hostile environment is considered,
the event learnt(m) can never occur”.
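Whether learnt(m) can occur depends on what the enemy can deduce from its initial knowledge and from the messages it has seen. The sketch below computes the closure of a knowledge set under decomposition of pairs and decryption with known keys, in the spirit of the deduction system mentioned in the Introduction. The tuple encoding of messages, the restriction to atomic keys, and the omission of message synthesis (pairing and encryption by the enemy) are simplifying assumptions of this example.

```python
def inverse(key):
    """Key needed to decrypt what `key` encrypted (symmetric keys are self-inverse)."""
    if key.startswith("PK_"):
        return "SK_" + key[3:]
    if key.startswith("SK_"):
        return "PK_" + key[3:]
    return key

def analyze(knowledge):
    """Close `knowledge` under projection of pairs and decryption with known keys."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for msg in list(known):
            if not isinstance(msg, tuple):
                continue                       # atoms cannot be decomposed
            if msg[0] == "pair":
                parts = msg[1:]
            elif msg[0] == "enc" and inverse(msg[2]) in known:
                parts = (msg[1],)              # decryption key is known
            else:
                parts = ()
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return known

def learnt(message, knowledge):
    return message in analyze(knowledge)

initial = {"A", "B", "PK_B", "PK_E", "SK_E"}              # enemy's initial knowledge
attack = initial | {("enc", "K_sess", "PK_E")}            # Message 3 of the attack
honest = initial | {("enc", "K_sess", "PK_B")}            # Message 3 of an honest run

print(learnt("K_sess", attack), learnt("K_sess", honest))
```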



2.5 The Example (Partially) Repaired 

In this section we show how to repair the initial protocol in order to avoid the
attacks reported above (even though other ones are possible). The attacks work
since it is possible to fake Message 2 in order to provide a wrong public key,
i.e., not the one associated to B. In actual implementations, the KDC sends the
public key together with the name of the associated user, all signed with its own
key. Hence, the resulting protocol is the following:

Message 1   A → KDC : A, B
Message 2   KDC → A : {PK_B, B}_{SK_KDC}
Message 3   A → B : {K_sess}_{PK_B}
Message 4   B → A : {M}_{K_sess}

where Message 2 now consists of the pair (PK_B, B) encrypted with the private
key of KDC (SK_KDC), in such a way that everyone can decrypt it and be sure
that the associated public key PK_B has been originated by KDC. It is easy to
see that the attacks previously shown are not possible anymore, since the enemy
is not able to generate the block {PK_E, B}_{SK_KDC}.

This updated protocol may still be subject to some form of attacks, namely
replay attacks. These attacks work since the enemy is able to re-use the
information obtained in previous runs of the protocol in order to fake messages for
A. Imagine the situation where the session key K_sess between A and B has been
safely established and is used to send two different messages, say M, M', to A,
by simply adding another step to the previous protocol:

Message 5   B → A : {M'}_{K_sess}

The enemy could eavesdrop Message 4, intercept Message 5 and replay
Message 4 to A. Thus, A receives message M twice. For example, if M represents
a bank transfer request this attack could result in a double transfer of money.
This is a message authenticity attack and is revealed since we obtain two events
received(M) instead of received(M), received(M').

This kind of attack may be prevented by inserting freshness in the messages
(such as newly generated random numbers). It is not our intention to give here 
a complete spectrum of attacks on cryptographic protocols; we just show that 
these are very subtle and make the design of such protocols very challenging. 
(For some guidelines about cryptographic protocols design see 0.) 
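
The replay and the freshness countermeasure can be illustrated with a small simulation. This is only a sketch under stated assumptions: encryption under K_sess is abstracted away, the nonce is assumed to be cryptographically bound to the payload, and the names `Receiver`, `fresh_message` and `replay_attack` are hypothetical.

```python
import secrets


class Receiver:
    """Hypothetical model of A; logs one received(...) event per accepted message."""

    def __init__(self, check_freshness):
        self.check_freshness = check_freshness
        self.seen_nonces = set()
        self.events = []

    def receive(self, nonce, payload):
        # With freshness checking, a nonce that was already seen marks a replay.
        if self.check_freshness and nonce in self.seen_nonces:
            return False
        self.seen_nonces.add(nonce)
        self.events.append(f"received({payload})")
        return True


def fresh_message(payload):
    """Attach a newly generated random nonce (assumed bound to the payload)."""
    return secrets.token_hex(16), payload


def replay_attack(check_freshness):
    """Enemy eavesdrops on Message 4, intercepts Message 5, replays Message 4."""
    a = Receiver(check_freshness)
    message4 = fresh_message("M")
    fresh_message("M'")       # Message 5 is generated but intercepted, never delivered
    a.receive(*message4)      # honest delivery of Message 4
    a.receive(*message4)      # replayed copy of Message 4
    return a.events
```

Without the check, the event log is received(M), received(M), which is the interference described above; with the check, the replayed copy is discarded and only one received(M) is logged.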

2.6 Non Repudiation and Fairness 

There are other interesting properties that can be rephrased in our common style. 
For example, non repudiation and fairness (i.e., fair message exchange). The 
former is related to the possibility of considering a certain message as a signed 
contract, which is thus not repudiable by the sender. This is typically obtained 
through protocols which are based on some electronic signature mechanism. 
Non repudiation is of course very important for electronic commerce. The latter 
property requires that mutual information exchanges are performed in a fair 
way, i.e., none of the parties involved in the exchange should get an advantage
by obtaining the information before the other party is also able to obtain it.
Indeed, if an exchange is not fair, the advantaged party could refuse to give
its information once it has received the information from the other party. This
property is sometimes required together with non repudiation when the parties 
want to simultaneously exchange two non repudiable messages (for example, 
one message could be the electronic payment and the other one could be the 
corresponding electronic receipt). Here we follow the treatment in 

(Fair) Non repudiation protocols are often quite complex. Here, in order to 
illustrate these properties we consider the following very simple example (see 
also PH): 

Message 1   A -> B : M, {M, B}_{SK_A}
Message 2   B -> A : {M, A}_{SK_B}

A sends a message M to B, signed with her secret key SK_A. This is a guarantee
that only A could have produced such an encryption. This means that A cannot
deny having sent such a message to B (note that B is also contained in the
signature). Then B sends back to A the message M signed with his own secret key,
in order to confirm the reception of M. After that, B will not be able to
deny that he has received M from A either. We can imagine that M represents some
form of electronic payment: A wants a proof that B has received the money from
her (a receipt) and B wants a proof that A has indeed sent the payment M to
him (for example, in case M is not a valid payment). These are non-repudiation
properties and are guaranteed by the signatures (assuming that neither A nor B
publicizes her/his secret key). For example, if A collects a certain piece of evidence that
B has sent a message (in our protocol the signature {M, A}_{SK_B} represents such
evidence), then B should have indeed sent such a message. This is somewhat
similar to the entity authentication property we have discussed above. The main
difference is that B can be malicious and may try to send faked evidence.

Consider now the fairness property. Indeed, in the simple protocol we have 
presented, B has an evident advantage over A. He can just refuse to send the last 
message and A will never be able to prove that she has indeed sent the money to 
B. The exchange is clearly not fair. We can try to define non-repudiation with 
fairness as follows: 

“P guarantees non-repudiation with fairness to A on a message M if,
whatever malicious B is considered, if B gets evidence that A has
originated M then A will also eventually obtain the evidence that B has
received M”.

We observe two important points. First of all, in this definition one of the parties
(B) takes the role of the hostile environment. This is intuitively correct, since
we want to see if B is able to cheat A by keeping the payment without releasing
the receipt. The second point is the most important: this property cannot be
expressed as a safety property (i.e., nothing bad happens). As a matter of fact
we are requiring that something good should happen if B gets his evidence, i.e.,
that A should also sooner or later get her evidence. This leads us to relax our
general scheme as follows:

“P guarantees non-repudiation with fairness to A on a message M if,
whatever hostile environment is considered (with a certain initial
knowledge φ_I), then P satisfies the specification α(P)”,

where B is the enemy (φ_I is then B's initial knowledge), the relation satisfies
is a deadlock-sensitive process preorder (e.g., the testing preorder) and α(P)
is the process where every time B gets his evidence then A also gets her own
evidence. The choice of a deadlock-sensitive preorder is justified by the fact that
we want to check that no deadlock is possible after B gets his evidence, and this is what
we want to require also in protocol P. In the example above, if B decides not to
send the second message we clearly obtain such a deadlock and the property is
not satisfied.
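
The difference between this liveness-style requirement and a safety property can be made concrete on maximal (completed) traces. The following is a minimal sketch, not the formal definition: the event names `B_ev_A_or_M` and `A_ev_B_rec_M` and the helper functions are hypothetical, and the two-message protocol above is reduced to the evidence events it produces.

```python
B_EV = "B_ev_A_or_M"    # B holds evidence that A originated M
A_EV = "A_ev_B_rec_M"   # A holds evidence that B received M


def is_fair(maximal_trace):
    """True if every occurrence of B_EV is eventually followed by A_EV."""
    pending = False
    for event in maximal_trace:
        if event == B_EV:
            pending = True
        elif event == A_EV:
            pending = False
    return not pending


def maximal_traces(b_is_honest):
    """Completed runs of the simple two-message protocol (illustrative model)."""
    if b_is_honest:
        return [[B_EV, A_EV]]
    # A malicious B may answer, or stop right after Message 1 (a deadlock for A).
    return [[B_EV, A_EV], [B_EV]]


def guarantees_fairness(b_is_honest):
    return all(is_fair(trace) for trace in maximal_traces(b_is_honest))
```

A run that stops after B's evidence is a perfectly legal prefix of a good run, so plain trace inclusion cannot reject it; only a check that looks at how runs end (here, maximal traces) exposes the unfair behaviour.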

3 A General Scheme for Security Properties 

We have seen that several different properties (message authenticity, entity au- 
thentication, secrecy) can be (informally) written in a similar style which sounds 
like: 

“P guarantees a security property S if, whatever hostile environment is
considered, P never shows some particular bad behaviour”.

In general, this set of bad behaviours depends on the particular property and
sometimes may also depend on the protocol P. For example, for message
authenticity we need the parameter m of P in order to define what a bad behaviour is.
It is sometimes easier to choose a complementary approach and describe which
are the good behaviours (they just correspond to all the behaviours that are not
bad). So, if we denote by α_S(P) the set of all possible good behaviours of P
with respect to the security property S, then our general scheme becomes the
following:

“P guarantees a security property S if, whatever hostile environment is
considered, P always shows behaviours in α_S(P)”.

Indeed, cryptographic protocols typically rely on some secret values (keys or 
random numbers used for challenge-response). Moreover, if we want to analyze 
secrecy we certainly have to face the presence of some initially secret message. 
We can thus slightly refine our scheme as follows: 

“P guarantees a security property S if, whatever hostile environment
is considered with a certain initial knowledge φ_I, then P always shows
behaviours in α_S(P)”.

Moreover, as discussed for fair non repudiation, it may be useful also to pa- 
rameterize the previous notion w.r.t. the notion of behaviour, by considering 
satisfaction relations among processes, hence obtaining: 

“P guarantees a security property S if, whatever hostile environment
is considered with a certain initial knowledge φ_I, then P satisfies the
specification α_S(P)”.

The (informal) considerations above are at the base of the proposal, originally
reported in of a uniform formal framework where security properties can
be defined. The proposed schema, called General NDC (GNDC for short), is
as follows:*

P is GNDC^α_⊴ iff ∀X ∈ E_C^{φ_I} : (P ‖ X) \ C ⊴ α(P)

where ⊴ is a behavioural preorder and α is a function from processes to processes.
Now, we can just define a specific property by suitably instantiating the function
α(P) and the preorder ⊴. We reconsider the properties presented so far, showing
informally their corresponding α(P) functions and ⊴ relations:

— Non-interference: α_NI(P(M)) = P(M) \ C. We obtain the exact definition
of NDC if we use trace equivalence as ⊴.

— Message authenticity: α_MA(P(M)) is the process where received(M) is the
only event received which may occur. For a formal characterisation of this
property in the GNDC scheme for a large class of protocols, please see ESI,
where the ⊴ relation is (a suitable) may testing or trace equivalence.

— Entity authentication: α_EA(P) is the process where commit(A,B) is always
preceded by run(B,A). Please refer to fH] for other authentication
properties based on the correspondence idea, such as the ones in the hierarchy
of [2H or message authentication as proposed in 12^. The relation ⊴ is in
general trace inclusion.

— Secrecy of m: α_Sec(P(m)) is the set of processes where the event learnt(m)
can never occur (for more details, see [14^1. The relation ⊴ is in general trace
inclusion.

— Non repudiation: α_NR(P(M)) is the process where whenever an evidence of
a message M is obtained then that message has been effectively sent (see
m for a deeper discussion). The relation ⊴ is in general trace inclusion.

— Fairness: α_fair(P(M)) is the process such that if the event B_ev_A_or_M
(signaling that B has evidence that A originated M) occurs, then the event
A_ev_B_rec_M (signaling that A has evidence that B received M) will
eventually happen (see for a deeper discussion). The relation ⊴ is in general
a failure or testing preorder (see EEl).
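
For the instances above whose relation is trace inclusion, the scheme can be sketched on finite trace sets. This is an illustration under strong assumptions, not the formal definition: the universal quantification over all environments is replaced by a small, hand-written sample of environments, processes are represented directly by their sets of observable traces, and all names (`gndc_check`, `ALPHA_MA`, `TRACES_UNDER`) are hypothetical.

```python
def satisfies_by_trace_inclusion(observed, allowed):
    """Trace inclusion: every observed trace must be an allowed trace."""
    return observed <= allowed


def gndc_check(traces_under, alpha_traces):
    """Return the interfering traces found for each sampled environment X.

    traces_under maps an environment name to the trace set of P composed
    with X and restricted on C. An empty result only means that no attack
    was found among the sampled environments.
    """
    return {
        env: observed - alpha_traces
        for env, observed in traces_under.items()
        if not satisfies_by_trace_inclusion(observed, alpha_traces)
    }


# alpha_MA(P(M)): received(M) is the only 'received' event that may occur.
ALPHA_MA = {(), ("received(M)",)}

# Illustrative trace sets of the protocol under two sample environments.
TRACES_UNDER = {
    "0 (idle enemy)": {(), ("received(M)",)},
    "enemy faking the message": {(), ("received(M)",), ("received(M_E)",)},
}

ATTACKS = gndc_check(TRACES_UNDER, ALPHA_MA)
```

Here `ATTACKS` contains a single entry: under the faking enemy, the trace with received(M_E) is not allowed by the specification, so it is reported as an interference.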



3.1 Security Attacks as Interferences 

In the previous section we have shown how several security properties can be
seen as instances of the following general scheme:

“P guarantees a security property S if, whatever hostile environment
is considered with a certain initial knowledge φ_I, then P satisfies the
specification α_S(P)”,

where the relation satisfies and the function α_S(P) are property-dependent
parameters.

* Indeed GNDC depends on the set φ_I, but we will omit it for the sake of readability.

It would be useful to find the most restrictive α_S(P), as it would induce
the strongest property, up to the chosen notion of behaviour (i.e., the chosen
relation satisfies). The idea is to use an α_S(P) which returns an encapsulation of
protocol P, i.e., a version of P which is completely isolated from the environment.
Intuitively, this secure encapsulation of P should correspond to the execution of
P in a perfectly secure network where only the honest parties are present. In our
process algebra setting, this corresponds to the restriction of all public channels
where protocol messages are sent. This makes it impossible for an intruder to
interfere with the protocol execution. Let us briefly explain this important point.
Consider a preorder ⊴ between processes. Next, consider the induced equivalence
≈ as ⊴ ∩ ⊴⁻¹. Also suppose that for every process P we have

(P ‖ 0) \ C ≈ P \ C

where 0 is the process that does nothing. This means that the process restricted
on C is equivalent to the protocol in composition with the intruder that does
nothing. The previous property holds for every security property we have studied.
Please also note that, by definition, 0 ∈ E_C^{φ_I} for every φ_I. So it is very natural
to consider functions α and processes P such that:

P \ C ⊴ α(P)

This simply means that the protocol P is correct (as it satisfies its specification
α(P)) at least when it is not under attack. This condition can be seen
as a reasonable criterion for any good protocol: it must be correct at least when
it is not under attack! Under this observation, it is clear that P ∈ NDC implies
P ∈ GNDC^α_⊴.

If NDC holds then we can say that the hostile environment has no effect at
all on P, since it still behaves as if it were isolated. It is important to note that
what we observe of P's behaviour are exactly the events we have discussed in the
previous section. It is easy to see that, if we specify all the events corresponding
to a certain set of properties, then NDC will imply all of them.†

† Note that the notion of satisfies must be chosen as the strongest one used by the
security properties considered.

When P does not guarantee NDC we say that an interference is possible,
i.e., a behaviour that is caused by the hostile environment. It turns out that
an attack on a security property is revealed by NDC as an interference of the
enemy on the protocol. This allows us to use NDC in order to detect different
attacks all at once. In the next section we illustrate this issue with an example.

4 An Example

In this section we consider a larger protocol and we show how the analysis of
(some of) the security properties presented above can be carried out in a uniform
way. The protocol is the Woo-Lam public key one, which has been proposed for
mutual entity authentication and key-exchange. The protocol, as reported in
m, is the following sequence of 7 messages:



Message 1   A -> KDC : A, B
Message 2   KDC -> A : {PK_B}_{SK_KDC}
Message 3   A -> B   : {A, N_A}_{PK_B}
Message 4   B -> KDC : A, B, {N_A}_{PK_KDC}
Message 5   KDC -> B : {PK_A}_{SK_KDC}, {{N_A, K, A, B}_{SK_KDC}}_{PK_B}
Message 6   B -> A   : {{N_A, K, A, B}_{SK_KDC}, N_B}_{PK_A}
Message 7   A -> B   : {N_B}_K



Alice sends to KDC a request of connection with Bob. Then KDC replies with
a certified copy of B’s public key. This copy is indeed signed with KDC’s secret
key, i.e., only KDC may have generated it. Alice checks the signature and sends
to Bob a challenge N_A encrypted with Bob’s public key. Bob forwards N_A to
KDC encrypting it with KDC’s public key and adding both his own identifier
and the one of Alice. KDC is now ready to generate a certificate containing the
challenge N_A, the fresh session key K and the two identifiers A and B. KDC
sends this to Bob (encrypted with Bob’s public key) together with a signed
copy of Alice’s public key. Bob checks the signature and forwards to Alice the
certificate received from KDC together with a challenge N_B, all encrypted with
Alice’s public key. Finally, Alice sends back to Bob the challenge N_B encrypted
with the new session key K.

This protocol looks quite complex. As a matter of fact it mixes the requests
for public keys with entity authentication and key-exchange. In particular the
first two messages and the first part of message 5 are for public key distribution.
The challenges N_A and N_B are used to guarantee mutual entity authentication.
Finally, the certificate provides (authenticated) key-distribution. Indeed, this
protocol contains two errors with respect to the correct version, which cause a
number of attacks. We now analyze the protocol by applying the ideas developed
in the previous section. First of all we point out in detail the various
security properties that the protocol should guarantee:



— the public keys PK_A and PK_B should be authentic from KDC (this is why they
are signed);

— it is also important that the session key K is authentic, i.e., no enemy should
be able to force A and B to use a faked session key;

— of course, the session key K should also remain secret;

— also the challenges should remain secret (since they are always sent
encrypted) but this is not crucial since they are used only for guaranteeing
entity authentication; in protocols where the nonces are also used for
generating a new session key, the secrecy requirement becomes crucial;

— finally, the protocol should guarantee mutual entity authentication between
A and B.



We now show all the events that are used to model such properties as suitable
annotation to the protocol, where the secrecy event learnt(K) is not reported
as it is performed by the omitted enemy. This allows us to check all the properties
in just one step following our NDC approach.

1   A -> KDC : A, B                                       run(A, B)
2   KDC -> A : {PK_B}_{SK_KDC}                            received(PK_B)
3   A -> B   : {A, N_A}_{PK_B}                            run(B, A)
4   B -> KDC : A, B, {N_A}_{PK_KDC}
5   KDC -> B : {PK_A}_{SK_KDC},                           received(PK_A)
               {{N_A, K, A, B}_{SK_KDC}}_{PK_B}
6   B -> A   : {{N_A, K, A, B}_{SK_KDC}, N_B}_{PK_A}      received(K), commit(A, B)
7   A -> B   : {N_B}_K                                    commit(B, A)

Table 1. An attack to secrecy and message authenticity

1   A -> KDC    : A, B                                    run(A, B)
2   KDC -> A    : {PK_B}_{SK_KDC}                         received(PK_B)
3   A -> B      : {A, N_A}_{PK_B}                         run(B, A)
4   B -> KDC    : A, B, {N_A}_{PK_KDC}
5   KDC -> E(B) : {PK_A}_{SK_KDC},
                  {{N_A, K, A, B}_{SK_KDC}}_{PK_B}
5'  E(KDC) -> B : {PK_E}_{SK_KDC},                        received(PK_E)
                  {{N_A, K, A, B}_{SK_KDC}}_{PK_B}
6   B -> E(A)   : {{N_A, K, A, B}_{SK_KDC}, N_B}_{PK_E}   learnt(K)
6'  E(B) -> A   : {{N_A, K, A, B}_{SK_KDC}, N_B}_{PK_A}   received(K), commit(A, B)
7   A -> B      : {N_B}_K                                 commit(B, A)



The security properties that this protocol should guarantee have in common that
trace inclusion is their suitable ⊴ relation. Hence, an interference is a trace that
is composed only of security events and that is possible when the enemy
is active, but not when the protocol runs in isolation. Consider now the attack
sequence in Table 1, where the interference trace is reported in the right column.
In this attack E exploits (one of) the mistakes in the specification of the protocol
as reported in m- In particular, it is necessary to include in the certificate of
the public key also the identifier of the corresponding owner. Hence, in messages
2 and 5 the certificates should be {PK_B, B}_{SK_KDC} and {PK_A, A}_{SK_KDC},
respectively. The enemy can thus substitute in message 5 the signed public key of A
with its own signed key. Note that the enemy can obtain a signed copy of its
own key by just running the protocol honestly with another user. After that, the
sixth message will be encrypted by Bob with the public key of the enemy, thus
allowing the interception of the certificate and, consequently, of the session key
K. By observing the events we can identify two attacks:

1. the protocol does not guarantee the authenticity of Alice's public key in
message 5; we observe this through event received(PK_E); as we stated above,
this is caused by the absence of Alice's identifier inside the signature;

2. the protocol does not guarantee the secrecy of K; this is revealed by event
learnt(K).

Note also that we do not observe any entity authentication attack. Alice and Bob
are convinced that they are communicating with one another at the end of the protocol. Thus, the
secrecy attack over K becomes even more dangerous, as the enemy can easily
eavesdrop on every future communication encrypted with K between Alice and Bob
and they have no way of discovering this.

Indeed it is easy to show an attack similar to the previous one in order to
obtain also an entity authentication failure, e.g., see Table 2.

Table 2. An attack to secrecy, authenticity and entity authentication

E(A) -> KDC : A, B
KDC -> E(A) : {PK_B}_{SK_KDC}
E(A) -> B   : {A, N_A}_{PK_B}                         run(B, A)
B -> KDC    : A, B, {N_A}_{PK_KDC}
KDC -> E(B) : {PK_A}_{SK_KDC},
              {{N_A, K, A, B}_{SK_KDC}}_{PK_B}
E(KDC) -> B : {PK_E}_{SK_KDC},                        received(PK_E)
              {{N_A, K, A, B}_{SK_KDC}}_{PK_B}
B -> E(A)   : {{N_A, K, A, B}_{SK_KDC}, N_B}_{PK_E}   learnt(K)
E(A) -> B   : {N_B}_K                                 commit(B, A)

The idea is that the role played by A in the previous attack could be fully
simulated by the enemy, as done here. Since we have commit(B,A) with no
run(A,B), the entity authentication attack is indeed revealed. Hence, the
protocol does not guarantee the authentication of A with respect to B, i.e., the enemy
is indeed able to impersonate A. Note that the attacks on message authenticity
and secrecy are still valid.
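
The correspondence condition used here (every commit(X,Y) must be preceded by run(Y,X)) is easy to check mechanically on an interference trace. The following is a minimal sketch: the two traces are transcribed from the right-hand columns of the two attack tables, the function name is hypothetical, and the check is the simple non-injective form in which one run may justify several commits.

```python
import re

EVENT = re.compile(r"(run|commit)\((\w+),(\w+)\)")


def authentication_failures(trace):
    """Return every commit(X,Y) in the trace not preceded by a matching run(Y,X)."""
    started = set()
    failures = []
    for event in trace:
        match = EVENT.fullmatch(event)
        if match is None:
            continue  # not an entity authentication event
        kind, first, second = match.groups()
        if kind == "run":
            started.add((first, second))
        elif (second, first) not in started:
            failures.append(event)
    return failures


TABLE_1_TRACE = ["run(A,B)", "received(PK_B)", "run(B,A)", "received(PK_E)",
                 "learnt(K)", "received(K)", "commit(A,B)", "commit(B,A)"]
TABLE_2_TRACE = ["run(B,A)", "received(PK_E)", "learnt(K)", "commit(B,A)"]
```

On the first trace no failure is reported, in agreement with the observation that the first attack does not break entity authentication; on the second trace commit(B,A) is flagged because run(A,B) never occurs.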

Other complex attacks are possible, involving two or three parallel sessions 
of the protocol. All of them are based on the fact that the certificate does not 
include the corresponding identifier. As we already stated, this is an error in the 
version reported in In the original protocol m the certificates are correct. 
However in such a version there is another error, causing different failures, that 
has been corrected two months later by the same authors in m- 

5 Conclusion 

Our non interference-based approach to the analysis of protocols has been mech- 
anized 12101 , resulting in a tool that checks NDC on finite state representations 
of the honest participants and the (most general) enemy. With the help of this 
tool, we have been able to show failures upon two unflawed (to the best of our 
knowledge) protocols: Woo & Lam public key one-way authentication protocol 
and ISO public key two-pass parallel authentication protocol; and new failures 
upon three flawed protocols: Encrypted Key Exchange, Station to Station, Woo
& Lam symmetric key one-way authentication protocol (the last one reported in
Many other protocols have been analyzed, most of those reported in the 
cryptographic protocol library 0, and we have been able to capture the attacks 
reported there. We are now improving the efficiency of our tool in order to be 
able to analyze larger, commercial protocols for e-commerce, such as SET m- 

Future extensions of the approach include the modeling of cryptographic 
protocols with more concrete information, e.g., time and probability, that can 
be helpful in order to discover time/probability dependent attacks that cannot be 
revealed in a purely nondeterministic setting. Some initial work in this direction 
is H2|, where the NDC idea has been extended in the context of a discrete time 
process algebra, and applied to prevent timing covert channels in multilevel 
computer systems. 

Our NDC-based approach has been developed for a CCS-like calculus that 
is powerful enough to model most cryptographic protocols. However, recursive 
protocols as well as protocols for mobile systems (where channel names are 
passed as values in a communication) cannot be easily modeled. Hence, we are 
planning to study the extension of our approach to richer calculi, such as the 
spi-calculus 0.

Abstract. We obtain new results regarding the precise average bit-
complexity of five algorithms of a broad Euclidean type. We develop 
a general framework for analysis of algorithms, where the average-case 
complexity of an algorithm is seen to be related to the analytic behaviour 
in the complex plane of the set of elementary transformations determined 
by the algorithms. The methods rely on properties of transfer operators 
suitably adapted from dynamical systems theory and provide a unifying 
framework for the analysis of an entire class of gcd-like algorithms. 



1 Introduction 

Motivations. Euclid’s algorithm was analysed first in the worst case in 1733 
by de Lagny, then in the average-case around 1969 independently by Heilbronn 
| I1 2I | and Dixon and finally in distribution by Hensley m who proved in 
1994 that the Euclidean algorithm has Gaussian behaviour. The first methods 
used range from combinatorial (de Lagny, Heilbronn) to probabilistic (Dixon). 
In parallel, studies by Levy, Khinchin, Kuzmin and Wirsing had established the 
metric theory of continued fractions by means of a specific density transformer. 
The more recent works rely for a good deal on transfer operators, a far-reaching 
generalization of density transformers, originally introduced by Ruelle mm in 
connection with the thermodynamic formalism and dynamical systems theory 
0. Examples are Mayer’s studies on the continued fraction transformation [T5I . 
Hensley’s work uni and several papers of Vallee 

All the previous analyses deal with the number of arithmetical operations
performed during the execution of the algorithm. In this paper, we provide new
analyses that characterize the precise average bit-complexity of a class of
Euclidean algorithms.

We consider here five algorithms that are all classical variations of the Euclidean
algorithm and are called Classical (G), By-Excess (L), Centered (K),
Subtractive (T) and Binary (B). The complexity of these algorithms (in terms of the
number of arithmetical operations to be performed) is now well-known: The
two most common algorithms (G) and (K) have been analysed by Heilbronn
C21, Dixon 0 and Rieger H3. The Subtractive algorithm (T) was studied by
Yao and Knuth m, and Vardi |2S| analysed the By-Excess Algorithm (L) by
comparing it to the Subtractive Algorithm. Brent [2| and Vallee m have
analysed the Binary algorithm (B).

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 37.3- R^ 2000.

Methods. Our approach is a refinement of methods that have been already
used for instance in [41912312^ : it consists in viewing an algorithm of the broad
gcd type as a dynamical system, where each iterative step is a linear fractional
transformation (LFT) of the form z ↦ (az + b)/(cz + d). A specific set of
transformations is then associated with each algorithm. It already appears from previous
treatments that the computational complexity of an algorithm is in fact dictated
by the collective dynamics of its associated set of transformations.
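
For the classical algorithm, this viewpoint can be made concrete: a division step v = mu + r sends the ratio r/u to u/v = 1/(m + r/u), i.e., it applies the LFT z ↦ 1/(m + z), and a whole run is a composition of such maps. The sketch below is only an illustration of that correspondence; representing an LFT by its coefficient tuple (a, b, c, d) and all function names are choices made here, not notation from the paper.

```python
from fractions import Fraction


def classical_quotients(u, v):
    """Quotients m_1, ..., m_k of the classical Euclidean algorithm on 0 < u < v."""
    quotients = []
    while u != 0:
        m, r = divmod(v, u)
        quotients.append(m)
        u, v = r, u
    return quotients


def compose(outer, inner):
    """Coefficients of the LFT outer(inner(z)); this is a 2x2 matrix product."""
    a, b, c, d = outer
    p, q, r, s = inner
    return (a * p + b * r, a * q + b * s, c * p + d * r, c * q + d * s)


def apply_lft(lft, z):
    a, b, c, d = lft
    return (a * z + b) / (c * z + d)


def lft_of_run(u, v):
    """Compose the maps z -> 1/(m + z) over all division steps of the run."""
    lft = (1, 0, 0, 1)  # identity
    for m in classical_quotients(u, v):
        lft = compose(lft, (0, 1, 1, m))
    return lft
```

For instance, on the input (3, 7) the quotients are 2 and 3, the composed LFT has coefficients (1, 3, 2, 7), and evaluating it at 0 gives back the ratio 3/7.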

A previous work m describes a classification of gcd-like algorithms in terms
of the average number of arithmetical operations: some of them are fast, that
is, of logarithmic complexity Θ(log N), while others are slow, that is, of the
log-squared type Θ(log² N). It was established there that strong contraction
properties of the elementary transformations that build up a gcd-like algorithm
entail logarithmic cost, while the presence of an indifferent fixed-point leads to
log-squared behaviour.

It is not a priori clear whether the previous classification between fast algorithms
and slow algorithms gives access to the average bit-complexity. The reason is
that, even if fast algorithms perform fewer iterations, each iteration is often
more complex than in the case of slow algorithms. In this paper, we prove that,
in terms of the average bit-complexity, fast algorithms are of log-squared type
Θ(log² N) while slow ones are of log-cubed type Θ(log³ N). Our approach also
precisely determines the constants that intervene in the expected costs. They
are closely related to the main characteristics of the associated dynamical
system (entropy, invariant measure, ...). These constants are computable numbers
though they are not always related to classical constants of analysis. Our method
can also open access (it will be shown in the full paper) to characteristics of the
distribution of bit-complexity costs, including information on moments: the fast
algorithms appear to have concentration of distribution (the cost converges in
probability to its mean) while the slow ones exhibit an extremely large
dispersion of costs.

Technically, this paper relies on a description of relevant parameters by means 
of generating functions, a common tool by now in the average-case analysis of 
algorithms m- As is usual in number theory contexts, the generating func- 
tions are Dirichlet series. They are first proved to be algebraically related to 
specific operators that encapsulate all the important informations relative to 
the “dynamics” of the algorithm. Their analytical properties depend on spectral 
properties of the operators, most notably the existence of a “spectral gap” that 
separates the dominant eigenvalue from the remainder of the spectrum. This 
determines the singularities of the Dirichlet series of costs. The asymptotic ex- 
traction of coefficients is then achieved by means of Tauberian theorems, one 
of the many ways to derive the prime number theorem. Average bit-complexity 
estimates finally result. The main thread of the paper is thus summarized by the 
chain: 

Euclidean algorithm → Associated transformations → Transfer operator →
Dirichlet series of costs → Tauberian inversion → Average-case complexity.

This chain then leads to effective and simple criteria for distinguishing slow
algorithms from fast ones and for establishing concentration of distribution, etc.



Results and plan of the paper. Section 3 is the central technical section of the 
paper. There, we develop the line of attack outlined earlier and introduce suc- 
cessively Dirichlet generating functions, transfer operators of the Ruelle type, 
and the basic elements of Tauberian theory that are adequate for our purposes. 
The main results of this section are summarized in Theorem 1 that describes the 
singularities of generating functions of bit-costs and implies a general criterion 
for log-squared versus log-cubed behaviour. 

In Section 4, we return to our five favorite algorithms. The corresponding
analyses are summarized in Theorems 2 and 3 where we state our main results that
fall as natural consequences of the present framework. It results from the
analysis (Theorem 2) that the algorithms of the Fast Class (the Classical Algorithm
(G), the Centered Algorithm (K), and the Binary Algorithm (B)), when
applied to random integers less than N, have average bit-complexity of the form
B_N(H) ~ A(H) log² N, with H ∈ {G, K, B}. Each of the three constants
A(G), A(K), A(B) is a product of two constants: the first one is a constant à la
Lévy and is effectively characterized as the inverse of the entropy of the
associated dynamical system, while the second one is a constant à la Khinchin. The
constants related to the two classical algorithms are explicit and easily obtained.



A(G) 



61og^2 



[2- 



log 2 



log 






A{JC) 



61og(/)log2 log 2 1 (2*^ - + 2(j) 

TT^ 1 log</> log(/) — — 2 



The constant relative to the Binary Algorithm is expressed in terms of the in- 
variant measure rl) 2 {t)dt and its distribution function ^ 2 ( 1 ) (that are not explicit 
in this case) as 



A{B) 



2 log 2 
7t2 ^/>2(1) 



[ 1 + E 



m odd>l 



1 F2{ — ) 

2i(m) 



where ℓ(m) denotes the binary length of integer m. Exact computations (for
the first two algorithms) or various batches of a few thousand simulations on
numbers of order of (for the Binary Algorithm) suggest numerical values
for the three constants A(G) ≈ 1.24237, A(K) ≈ 1.12655, A(B) ≈ 0.7.

These values prove the efficiency of fast Euclidean Algorithms, compared to naive
multiplication whose average bit-complexity is log² N on integers less than N.
Theorem 3 proves that the algorithms of the Slow Class (the By-Excess
Algorithm (L) and the Subtractive Algorithm (T)) have average bit-complexity of
the log-cubed type, B_N(H) ~ A(H) log³ N with H ∈ {L, T}, and



A{r) = 



21og^2 



A{£) = 



log' 2

2 Five Variations of the Euclidean Algorithm

We present the five algorithms to be analysed; the first three use divisions, while
the last two use only simpler operations, such as subtractions and/or right shifts.

2.1. Euclidean Algorithms with divisions. There are two divisions between u
and v (v > u) that produce a positive remainder r such that 0 ≤ r < u: the
classical division (by-default) of the form v = mu + r, and the division
by-excess, of the form v = mu − r. The centered division between u and v (v > u),
of the form v = mu + εr, with ε = ±1, produces a positive remainder r such
that 0 ≤ r ≤ u/2. There are three Euclidean algorithms associated with each
type of division, respectively called the Classical Algorithm (G), the By-Excess
Algorithm (L), and the Centered Algorithm (K).

We denote by ℓ(x) the number of bits of the positive integer x. Then, the
bit-cost of a division step, of the form v = mu + εr, is equal to ℓ(u) × ℓ(m). It is
followed by exchanges which involve numbers u and r, so that the total cost of
a step is ℓ(u) × ℓ(m) + ℓ(u) + ℓ(r). In the case of the centered division, there
is possibly a supplementary subtraction (in the case when ε = −1) in order to
obtain a remainder in the interval [0, u/2].
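
As a sanity check of this cost model, the total bit-cost of a run of the classical algorithm can be accumulated step by step. This is a minimal sketch: ℓ(x) is taken to be Python's `int.bit_length()`, so that ℓ(0) = 0 for a zero remainder (an assumption of this sketch, since ℓ is only defined above for positive integers), and the function name is hypothetical.

```python
def bit_length(x):
    """l(x): number of bits of x; l(0) is taken to be 0 in this sketch."""
    return x.bit_length()


def classical_bit_cost(u, v):
    """Run the classical algorithm on 0 < u < v; return (gcd, total bit-cost)."""
    total = 0
    while u != 0:
        m, r = divmod(v, u)                       # division step v = m*u + r
        total += bit_length(u) * bit_length(m)    # cost of the division
        total += bit_length(u) + bit_length(r)    # cost of the exchange
        u, v = r, u
    return v, total
```

On the input (3, 7), the two steps 7 = 2·3 + 1 and 3 = 3·1 + 0 cost 2·2 + 2 + 1 = 7 and 1·2 + 1 + 0 = 3, for a total of 10.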

2.2. Euclidean Algorithms without divisions. On the other hand, there are two
algorithms where no divisions are performed, the Subtractive Algorithm (T) and
the Binary Algorithm (B).

The Subtractive Algorithm uses only subtractions, since it replaces the classical
division $v = mu + r$ by a sequence of $m$ subtractions of the form $v := v - u$. The
cost of a subtractive step $v = u + (v - u)$ is equal to $\ell(v)$. Then the bit-cost of
a sequence of $m$ subtractions equals $\ell(v) \times m$. It is followed by an
exchange, so that the total cost of a sequence of $m$ subtractions is $\ell(v) \times (m + 2)$
for the Subtractive Algorithm.
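A minimal sketch of this cost model follows. The function name is hypothetical, and taking $\ell(v)$ at the start of each run of subtractions is a simplifying assumption in the spirit of the approximation above.

```python
def subtractive_gcd_cost(u, v):
    """Subtractive algorithm on 0 < u <= v.

    A run of m subtractions followed by an exchange is charged l(v) * (m + 2),
    with l(v) taken at the start of the run. Returns (gcd, subtractions, cost).
    """
    cost = subtractions = 0
    while u != 0:
        ell_v = v.bit_length()
        m = 0
        while v >= u:          # m subtractions v := v - u
            v -= u
            m += 1
        cost += ell_v * (m + 2)
        subtractions += m
        u, v = v, u            # exchange
    return v, subtractions, cost
```

The total number of subtractions equals the sum of the quotients of the classical algorithm on the same input, which is why large quotients make this algorithm slow.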

The Binary Algorithm uses only subtractions and right shifts, since it performs
operations of the form $v := (v - u)/2^b$, where $b$ is the dyadic valuation of $v - u$,
denoted by $b := \mathrm{Val}_2(v - u)$, and defined as the largest exponent $b$ such
that $2^b$ divides $v - u$. This algorithm has two nested loops and each external
loop corresponds to an exchange. Between two exchanges, there is a sequence of
(internal) iterations that constitutes one external step.

Binary Euclidean Algorithm (u, v)

While u ≠ v do
    While u < v do
        b := Val_2(v − u); v := (v − u)/2^b;
    Exchange u and v;

Output: u (or v).

Each internal step consists of subtractions and shifts, and a sequence of internal
steps can be written as

\[ v = u + 2^{b_1} v_1, \quad v_1 = u + 2^{b_2} v_2, \quad \ldots, \quad v_{\ell-1} = u + 2^{b_\ell} v_\ell. \tag{1} \]

Here $v_\ell$ is strictly less than $u$, and plays the role of a remainder $r$, so that the
result of the sequence is a decomposition of the form $v = mu + 2^b r$, with $m$
odd, $m < 2^b$ and $r < u$, and constitutes an external step. The number of internal
steps $\ell$ equals $b(m)$, where $b(x)$ denotes the number of ones in the binary
expansion of $x$.

The cost of a shift $v := v/2^b$ is equal to $\ell(v)$. Then the bit-cost of a sequence of
internal steps whose result is a decomposition $v = mu + dr$ equals $\ell(v) \times b(m)$.
It is followed by an exchange, so that the total cost of an external step is
$\ell(v) \times (b(m) + 2)$ for the Binary Algorithm.
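The external step and its decomposition $v = mu + 2^b r$ can be checked with a short sketch. The helper names are hypothetical; inputs are assumed odd with $0 < u \le v$, and the sketch stops an external step as soon as $v \le u$.

```python
def val2(x):
    """Dyadic valuation: largest b such that 2**b divides x (x > 0)."""
    return (x & -x).bit_length() - 1

def binary_external_step(u, v):
    """Inner loop for odd u < v; returns (m, b, r, n) with v == m*u + 2**b * r.

    n is the number of internal steps; it equals the number of ones of m.
    """
    m = b = n = 0
    while u < v:
        bi = val2(v - u)
        v = (v - u) >> bi
        m += 1 << b        # the quotient gains a one at bit position b
        b += bi
        n += 1
    return m, b, v, n

def binary_gcd_cost(u, v):
    """Binary Euclidean algorithm; each external step costs l(v) * (b(m) + 2)."""
    cost = 0
    while u != v:
        m, b, r, n = binary_external_step(u, v)
        cost += v.bit_length() * (n + 2)
        u, v = r, u        # exchange
    return u, cost
```

For instance, `binary_external_step(3, 29)` yields `(7, 3, 1, 3)`: indeed 29 = 7·3 + 2³·1, and 7 has three ones in binary.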
Fig. 1. The five Euclidean algorithms.



2.3. The sets of linear fractional transformations. When given an input $(u_1, u_0)$,
$(u_1 < u_0)$, each of the five algorithms performs $k$ (external) steps of the form

\[ u_0 = m_1 u_1 + d_1 u_2, \quad u_1 = m_2 u_2 + d_2 u_3, \quad \ldots, \quad u_{k-1} = m_k u_k + d_k u_{k+1}, \]

and decomposes the rational $x := (u_1/u_0)$ as $(u_1/u_0) = h_1 \circ h_2 \circ \cdots \circ h_k(a)$, where
the $h_i$'s are linear fractional transformations (LFT) of the form $h_i = h_{[m_i, d_i]}$ with
$h_{[m,d]}(x) = 1/(m + dx)$, and $a := (u_{k+1}/u_k)$ is the last value of the rational.
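As an illustration of this decomposition, the following sketch records the digits $(m_i, d_i)$ of the classical algorithm (where every $d_i = 1$ and $a = 0$) and recomposes the input rational exactly. The function names are hypothetical.

```python
from fractions import Fraction

def classical_digits(u1, u0):
    """Digits (m_i, d_i) of the classical algorithm on 0 < u1 < u0 (all d_i = 1)."""
    digits = []
    u, v = u1, u0
    while u != 0:
        m, r = divmod(v, u)
        digits.append((m, 1))
        u, v = r, u
    return digits

def compose_at(digits, a):
    """Evaluate h_1 o h_2 o ... o h_k (a), with h_[m,d](x) = 1/(m + d*x)."""
    x = Fraction(a)
    for m, d in reversed(digits):   # innermost map h_k is applied first
        x = 1 / (m + d * x)
    return x
```

For the input (5, 13) the digits are 2, 1, 1, 2, and `compose_at` applied to them at $a = 0$ gives back 5/13.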

The precise form of the possible LFT's depends on the algorithm; there may
exist a special set $\mathcal{F}$ of LFT's in the final step: for instance, in the Classical
Algorithm ($\mathcal{G}$), the last quotient $m = 1$ is forbidden. However, all the other
steps use the same set of LFT's, that we call the generic set $\mathcal{H}$.

The value $a$ equals 1 for the By-Excess Algorithm ($\mathcal{E}$) and the Binary Algorithm
($\mathcal{B}$), and equals 0 for the other three algorithms. The rational inputs of each
algorithm belong to the basic interval $I = [0, \rho]$ with $\rho = 1$ or $\rho = 1/2$: for the
Centered Algorithm ($\mathcal{K}$), one has $\rho = 1/2$, and otherwise $\rho = 1$. For the first four
algorithms, the valid inputs are all the rationals of $I$, while the valid inputs of
the last algorithm are only the odd rationals of $I$. The variable “valid” has two
possible values {all, odd}, and finally, the type of the algorithm is defined as the
triple ($\rho$, valid, $a$).

In all the cases, when performing $k$ (external) steps on the input $(u_1, u_0)$, the
bit-cost $C(u_1, u_0)$ of the algorithm is a sum of $k$ terms, the $i$-th term representing
the cost of the $i$-th (external) step and being a product of two factors; the first
factor involves the binary length $\ell(u_j)$ of integer $u_j$ (with $j$ possibly equal to
$i-1$, $i$ or $i+1$ according to the precise algorithm), while the second one involves
a cost relative to the $i$-th LFT to be performed, of the form $c(h_i)$. In the sequel,
we can replace the length $\ell(u)$ of integer $u$ by its logarithm $\log_2(u)$ in base 2 and
always consider $\log_2(u_i)$ as the first factor to be studied. In contrast, we have to
work with the exact cost due to the LFT. Finally, the bit-cost $C(u_1, u_0)$ of the
algorithm on the input $(u_1, u_0)$ will be always of the form

\[ C(u_1, u_0) := \sum_{i=1}^{k} c(h_i) \times \log_2 u_i. \tag{2} \]

The table of Figure 1 describes the precise form of the divisions, the generic
set $\mathcal{H}$ of associated LFT's, the final set $\mathcal{F}$ and the cost $c(h)$ of the LFT's that
intervene in (2).

3 Generating Functions, Dynamical Operators 
and Tauberian Theorems 

Here, we describe the general tools for analysing bit-complexities of algorithms of 
the Euclidean type. We first introduce the Dirichlet generating functions relative 
to the bit-cost of the algorithms, so that the average bit-complexity involves 
partial sums of coefficients of these Dirichlet series. Tauberian Theorems are 
a classical tool that transfers analytical behaviour of Dirichlet series near their 
singularities into asymptotic behaviour of their coefficients. Then, by viewing the 
algorithm as a dynamical system, we relate generating functions of bit-cost to 
the Ruelle operator associated with the algorithm, so that we can easily describe 
the singularities of intervening generating functions. 

3.1. Generating functions. The following sets relative to the basic interval $I$,

\[ \tilde\Omega := \{(u,v);\ u, v \text{ valid},\ u/v \in I\}, \qquad \tilde\Omega_N := \{(u,v) \in \tilde\Omega,\ v \le N\}, \]
\[ \Omega := \{(u,v);\ u, v \text{ valid},\ \gcd(u,v) = 1,\ u/v \in I\}, \qquad \Omega_N := \{(u,v) \in \Omega,\ v \le N\}, \]

are the possible inputs of an algorithm. We denote by $C(u,v)$ the bit-complexity
of the algorithm on the input $(u,v)$ as given in (2). We propose to study the
average bit-complexity $\tilde B_N$ of an algorithm on $\tilde\Omega_N$ and the average bit-complexity
$B_N$ of an algorithm on $\Omega_N$, then evaluate the asymptotic behaviour (for $N \to \infty$)
of these costs. In fact, it is sufficient to study $B_N$, and this will be shown in the
full paper. The Dirichlet generating functions of costs,

\[ F(s) := \sum_{(u,v) \in \Omega} \frac{1}{v^s}, \qquad G(s) := \sum_{(u,v) \in \Omega} \frac{C(u,v)}{v^s}, \tag{3} \]

are of the form

\[ F(s) = \sum_{n \ge 1} \frac{a_n}{n^s}, \qquad G(s) = \sum_{n \ge 1} \frac{c_n}{n^s}, \]

where $a_n$ is the number of pairs $(u,v)$ of $\Omega$ with fixed $v = n$, and $c_n$ is the
cumulative cost on pairs $(u,v)$ of $\Omega$ with fixed $v = n$, so that the average
cost $B_N$ to be studied is exactly the ratio between the partial sums $\sum_{n \le N} c_n$ of
the coefficients of the Dirichlet series $G(s)$ and the partial sums $\sum_{n \le N} a_n$ of the
coefficients of the Dirichlet series $F(s)$.
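The coefficients $a_n$, $c_n$ and the ratio of partial sums can be computed by brute force for small $N$. The sketch below is illustrative only: it takes "valid = all" with the interval $[0,1]$, restricts to pairs $1 \le u < v$, and uses the number of classical division steps as a stand-in for the cost $C(u,v)$; all function names are hypothetical.

```python
from math import gcd

def classical_steps(u, v):
    """Number of division steps of the classical algorithm (stand-in cost)."""
    k = 0
    while u:
        u, v = v % u, u
        k += 1
    return k

def dirichlet_coefficients(N, cost=classical_steps):
    """a[n] = number of coprime pairs (u, n) with 1 <= u < n; c[n] = their total cost."""
    a = [0] * (N + 1)
    c = [0] * (N + 1)
    for n in range(2, N + 1):
        for u in range(1, n):
            if gcd(u, n) == 1:
                a[n] += 1
                c[n] += cost(u, n)
    return a, c

def average_cost(N, cost=classical_steps):
    """B_N as the ratio of the partial sums of c_n and a_n."""
    a, c = dirichlet_coefficients(N, cost)
    return sum(c) / sum(a)
```

Any other cost function of two arguments, such as a bit-cost, can be passed as `cost`; here $a_n$ is Euler's totient of $n$ for $n \ge 2$.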



3.2. Tauberian Theorems. The asymptotic evaluation of $B_N$ (for $N \to \infty$) is
made possible by the following Tauberian theorem, to be applied to the
Dirichlet series $F(s)$ and $G(s)$.

Tauberian Theorem. [Delange] Let $F(s) = \sum_{n \ge 1} a_n n^{-s}$ be a Dirichlet series with
non negative coefficients such that $F(s)$ converges for $\Re(s) > \sigma > 0$. Assume that
(i) $F(s)$ is analytic on $\Re(s) = \sigma$, $s \ne \sigma$, and
(ii) for some $\gamma \ge 0$, one has $F(s) = A(s)(s - \sigma)^{-\gamma - 1} + C(s)$, where $A$, $C$ are
analytic at $\sigma$, with $A(\sigma) \ne 0$.
Then, as $N \to \infty$,

\[ \sum_{n \le N} a_n = \frac{A(\sigma)}{\sigma\, \Gamma(\gamma + 1)}\, N^{\sigma} \log^{\gamma} N\, [1 + \epsilon(N)], \qquad \epsilon(N) \to 0. \]

The theorem applies to the Dirichlet series $F$, $G$ defined in (3) with $\sigma = 2$. For $F(s)$,
it applies with $\gamma = 0$. For $G(s)$, it applies with $\gamma = 2$ or $\gamma = 3$. For the slow
algorithms, $\gamma$ equals 3, and the average bit-complexity will be of order $\log^3 N$.
For the fast algorithms, $\gamma$ equals 2, and the average bit-complexity will be of
order $\log^2 N$.

First, the function $F(s)$ is closely linked to the Riemann series of valid numbers
$\tilde\zeta$, defined as $\tilde\zeta(s) := \sum_{v \text{ valid}} v^{-s}$, via the equalities

\[ F(s) = \alpha\, \frac{\tilde\zeta(s-1)}{\tilde\zeta(s)} \quad \text{with } \alpha = \rho \text{ if valid = all, and } \alpha = \rho/2 \text{ if valid = odd}. \]

Since $\tilde\zeta$ is itself easily related to the classical Zeta function, it is then clear that
the Tauberian Theorem applies to $F(s)$ with $\sigma = 2$ and $\gamma = 0$. However, it is
not clear how to apply directly the Tauberian Theorem to $G(s)$. In the following,
we obtain expressions for $G(s)$ which involve suitable Ruelle operators and from
which the location and the nature of the singularities become apparent.



3.3. Algebraic properties of Ruelle operators. The Ruelle operator $R_{s,h}$ relative
to a LFT $h$ depends on a complex parameter $s$ and is defined as

\[ R_{s,h}[f](x) := \frac{1}{D[h](x)^s}\, f \circ h(x), \tag{4} \]

where $D[h]$ denotes the denominator of the linear fractional transformation
(LFT) $h$, defined for $h(x) = (ax + b)/(cx + d)$ with $a, b, c, d$ coprime integers
by $D[h](x) := |cx + d| = |\det h|^{1/2}\, |h'(x)|^{-1/2}$.

Given a cost function $c$ defined on the LFT $h$, we introduce another Ruelle
operator relative to $h$,

\[ R^{[c]}_{s,h}[f](x) := \frac{c(h)}{D[h](x)^s}\, f \circ h(x). \tag{5} \]

Finally, with an algorithm that uses a set $\mathcal{H}$ of LFT's, one associates two Ruelle
operators,

\[ H_s := \sum_{h \in \mathcal{H}} R_{s,h}, \qquad H_s^{[c]} := \sum_{h \in \mathcal{H}} R^{[c]}_{s,h}. \tag{6} \]

The multiplicative property of the denominator $D$, i.e.,

\[ D[h \circ g](x) = D[h](g(x))\, D[g](x), \]

is translated into a multiplicative property of Ruelle operators: given two LFT's
$h$ and $g$, the Ruelle operator $R_{s, h \circ g}$ associated with the LFT $h \circ g$ is exactly
the operator $R_{s,g} \circ R_{s,h}$. More generally, given two sets of LFT's, $\mathcal{L}$ and $\mathcal{K}$, and
their Ruelle operators $K_s$, $L_s$, the set $\mathcal{L}\mathcal{K}$ is formed of all $h \circ g$ with $h \in \mathcal{L}$ and
$g \in \mathcal{K}$, and the Ruelle operator relative to the set $\mathcal{L}\mathcal{K}$ is exactly the operator
$K_s \circ L_s$. In particular, the Ruelle operator relative to the set $\mathcal{H}^* := \bigcup_{k \ge 0} \mathcal{H}^k$ is
exactly $\sum_{k \ge 0} H_s^k = (I - H_s)^{-1}$. This is the quasi-inverse of the Ruelle operator
$H_s$ associated with the set $\mathcal{H}$.
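Both multiplicative properties can be verified exactly on a small example. In the sketch below an LFT is stored as the coefficient tuple of $(ax+b)/(cx+e)$ and composed by matrix product; the names `lft`, `compose`, `denom` and `ruelle` are hypothetical, and the integer exponent $s = 2$ is chosen so that rational arithmetic stays exact.

```python
from fractions import Fraction

def lft(m, d):
    """Coefficients (a, b, c, e) of h_[m,d](x) = 1/(m + d*x) = (a*x + b)/(c*x + e)."""
    return (0, 1, d, m)

def apply_lft(h, x):
    a, b, c, e = h
    return Fraction(a * x + b) / Fraction(c * x + e)

def compose(h, g):
    """Coefficients of h o g (g is applied first), by 2x2 matrix product."""
    a, b, c, e = h
    p, q, r, t = g
    return (a * p + b * r, a * q + b * t, c * p + e * r, c * q + e * t)

def denom(h, x):
    """D[h](x) = |c*x + e|."""
    _, _, c, e = h
    return abs(c * x + e)

def ruelle(h, s):
    """R_{s,h}: maps f to the function x -> D[h](x)**(-s) * f(h(x))."""
    return lambda f: (lambda x: denom(h, x) ** (-s) * f(apply_lft(h, x)))
```

With `h = lft(2, 1)`, `g = lft(3, -1)` and `x = Fraction(1, 3)`, one gets `denom(compose(h, g), x) == denom(h, apply_lft(g, x)) * denom(g, x)`, both sides being 19/3, and applying `ruelle(compose(h, g), 2)` agrees with applying `ruelle(h, 2)` first and `ruelle(g, 2)` second.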



3.4. Ruelle operators and generating functions. We show now how the Ruelle
operators intervene in the evaluation of the generating function of costs $G(s)$.
We consider here a Euclidean Algorithm and its set of LFT's $\mathcal{H}$ together with
its final set $\mathcal{F}$ defined in Fig. 1. The Ruelle operators $H_s$, $H_s^{[c]}$, $F_s$, $F_s^{[c]}$ relative
to $\mathcal{H}$ or $\mathcal{F}$ and defined in (4), (5), (6) will play a central role in the analysis.

An execution of the algorithm on the input $(u_1, u_0)$ of $\Omega$ performs $k$ external
steps of the form $u_0 = m_1 u_1 + d_1 u_2$, ..., $u_{k-1} = m_k u_k + d_k u_{k+1}$ and
decomposes the rational $(u_1/u_0)$ as $(u_1/u_0) = h_1 \circ h_2 \circ \cdots \circ h_k(a)$, where the
$h_i$'s are elements of $\mathcal{H}$ (for $i \le k - 1$) or elements of $\mathcal{F}$ (for $i = k$), and $a$ is
the last value of the rational. When given an index $i$, $1 \le i \le k$, we consider
now three different parts of the LFT $h = h_1 \circ h_2 \circ \cdots \circ h_k$: the beginning part
$b_i(h) := h_1 \circ h_2 \circ \cdots \circ h_{i-1}$, the ending part $e_i(h) := h_{i+1} \circ h_{i+2} \circ \cdots \circ h_k$, and
finally the $i$-th component $h_i$. Then the sequence of the rationals $(u_{i+1}/u_i)$ is
defined from the relations

\[ \frac{u_{i+1}}{u_i} = h_{i+1} \circ h_{i+2} \circ \cdots \circ h_k(a) = e_i(h)(a), \]

and, since $u_i$ and $u_{i+1}$ are coprime, the equality $D[e_i(h)](a) = u_i$ holds.
With an operator $L_s$, we associate the operator $\Delta L_s$ defined by

\[ \Delta L_s := \frac{-1}{\log 2}\, \frac{d}{ds} L_s. \]
When applied to $R_{s,h}$ defined in (4), the functional $\Delta$ is well-suited to the
problem since it produces at the numerator the logarithm $\log_2 D[h](x)$. We then
introduce the main operator of this work,

\[ S_{s,h} := \sum_{i=1}^{k-1} \Delta R_{s, e_i(h)} \circ R^{[c]}_{s, h_i} \circ R_{s, b_i(h)}, \]

and we claim that, when applied to function $f = 1$ and point $x = a$, this operator
generates the cost $C(u_1, u_0)$ of the algorithm on input $(u_1, u_0)$, defined in (2),

\[ S_{s,h}[1](a) = \frac{1}{u_0^s} \sum_{i=1}^{k-1} c(h_i) \times \log_2 u_i = \frac{1}{u_0^s} \sum_{i=1}^{k} c(h_i) \times \log_2 u_i = \frac{1}{u_0^s}\, C(u_1, u_0) \]

(we replace the upper index $k$ by $k - 1$ thanks to equality $u_k = 1$ that holds since
$u_k$ is just the gcd of $u_0$ and $u_1$). Then, when $(u_1, u_0)$ is a general element of $\Omega$,
the LFT $h$ is a general element of the set $\mathcal{H}^* \mathcal{F}$, so that an alternative expression
of the main Dirichlet series $F$, $G$ defined in (3) holds:

\[ F(s) = \sum_{h \in \mathcal{H}^* \mathcal{F}} R_{s,h}[1](a), \qquad G(s) = \sum_{h \in \mathcal{H}^* \mathcal{F}} S_{s,h}[1](a). \]

When the index $i$ varies in $[1, k-1]$, the beginning part $b_i(h)$ is a general element
of $\mathcal{H}^*$, the ending part $e_i(h)$ is a general element of $\mathcal{H}^* \mathcal{F}$, while the $i$-th component
is a general element of $\mathcal{H}$, so that

\[ \sum_{h \in \mathcal{H}^* \mathcal{F}} S_{s,h} = \Delta\bigl[F_s \circ (I - H_s)^{-1}\bigr] \circ H_s^{[c]} \circ (I - H_s)^{-1}, \qquad \sum_{h \in \mathcal{H}^* \mathcal{F}} R_{s,h} = F_s \circ (I - H_s)^{-1}, \]

where $H_s$, $F_s$ are the Ruelle operators relative to the sets $\mathcal{H}$, $\mathcal{F}$ used by the
algorithm. We finally deduce expressions for the Dirichlet series $F(s)$, $G(s)$ that
mainly involve the quasi-inverse of operator $H_s$:

\[ F(s) = F_s \circ (I - H_s)^{-1}[1](a), \tag{7} \]
\[ G(s) \asymp F_s \circ (I - H_s)^{-1} \circ \Delta H_s \circ (I - H_s)^{-1} \circ H_s^{[c]} \circ (I - H_s)^{-1}[1](a). \tag{8} \]

(Here, the symbol $\asymp$ means that we keep only the main term of the expression,
i.e., the one that contains the largest number of occurrences of $(I - H_s)^{-1}$.)

3.5. Functional Analysis. Here, we consider the following conditions on a set
$\mathcal{H}$ of LFT's that will entail all the properties that we need for applying the
Tauberian Theorem to the quasi-inverse $(I - H_s)^{-1}$ of the Ruelle operator.

Conditions Q($\mathcal{H}$). There exist an open disk $V$ whose closure contains the basic
interval $I = [0, \rho]$, and a real $\alpha < 2$ such that

(C1) For every LFT $h \in \mathcal{H}$, $h$ and $|h'|$ have an analytic continuation on $V$, and
$h$ maps the closure $\bar V$ of disk $V$ inside $V$.

(C2) There exists a convenient Banach functional space $\mathcal{F}(V)$ formed with
analytic functions on $V$ such that each Ruelle operator $R_{s,h}$ (for $h \in \mathcal{H}$) acts on
$\mathcal{F}(V)$ and is compact. Moreover, the series of norms $\sum_{h \in \mathcal{H}} \|R_{s,h}\|$ converges on
the plane $\Re(s) > \alpha$.

(C3) For $s = 2$, the operator $H_2$ is a density transformer: for all $f \in \mathcal{F}(V)$
positive on $V \cap \mathbb{R}$, one has $\int_I H_2[f](t)\,dt = \int_I f(t)\,dt$.

(C4) For some integer $A$, the set $\mathcal{H}$ contains a subset
$\mathcal{D} := \{h \mid h(x) = A/(c + x) \text{ with integers } c \to \infty\}$.
If conditions Q($\mathcal{H}$) hold, the following main result proves that the quasi-inverse
of the Ruelle operator which intervenes in the expressions (7), (8) of the generating
functions $F(s)$ and $G(s)$ fulfills all the hypotheses of the Tauberian Theorem.
Moreover, the Dirichlet series $G(s)$ admits a pole at $s = 2$ of order at least three.
This order is exactly three if the Ruelle operator of costs $H_s^{[c]}$ is regular at $s = 2$.
However, the Ruelle operator of costs may itself have a pole at $s = 2$, so that
the total order of the pole at $s = 2$ becomes four. This implies a general criterion
that distinguishes between “fast” algorithms with a log-squared behaviour and
“slow” algorithms with a log-cubed behaviour.

Theorem 1. Let ($\mathcal{H}$) be some Euclidean Algorithm that uses a set $\mathcal{H}$ of LFTs
for which conditions Q($\mathcal{H}$) hold.

(a) Then, the quasi-inverse $(I - H_s)^{-1}$ is analytic on the punctured plane
$\{\Re(s) \ge 2,\ s \ne 2\}$ and has a pole of order 1 at $s = 2$. Near $s = 2$, one has, for any
function $f$ of $\mathcal{F}(V)$ positive on $V \cap \mathbb{R}$, and any $x \in V \cap \mathbb{R}$,

\[ (I - H_s)^{-1}[f](x) \sim \frac{-1}{\lambda'(2)}\, \frac{1}{s - 2}\, \psi_2(x) \int_I f(t)\,dt, \tag{9} \]

where $\lambda(s)$ is the dominant eigenvalue of $H_s$ and $\psi_2$ is the dominant eigenfunction
of $H_2$ defined by the normalization condition $\int_I \psi_2(x)\,dx = 1$.

(b) Suppose that the Ruelle operator of costs is regular at $s = 2$. Then the
average bit-complexity of the Euclidean Algorithm on the set of valid inputs of
denominator less than $N$ is of asymptotic logarithmic-squared order,

\[ \tilde B_N(\mathcal{H}) \sim B_N(\mathcal{H}) \sim A(\mathcal{H}) \log_2^2 N \quad \text{with} \quad A(\mathcal{H}) = \frac{\log 2}{h(\mathcal{H})}\, E_\infty[c]. \]

Here $h(\mathcal{H})$ is the entropy of the dynamical system relative to the algorithm
and $E_\infty[c]$ denotes the average value of cost $c$ related to the LFT $h$ when the
interval $I$ is endowed with the invariant measure associated with the dominant
eigenfunction of $H_2$.

(c) Suppose that the Ruelle operator of costs has a pole of order 1 at $s = 2$
and the integral $I(s) := \int_I H_s^{[c]}[\psi_2](t)\,dt$ satisfies $I(s)(s - 2) \to A$ for $s \to 2$.
Then the average bit-complexity of the Euclidean Algorithm on the set of valid
inputs of denominator less than $N$ is of asymptotic logarithmic-cubed order,

\[ \tilde B_N(\mathcal{H}) \sim B_N(\mathcal{H}) \sim A(\mathcal{H}) \log_2^3 N \quad \text{with} \quad A(\mathcal{H}) = \frac{\log^2 2}{3\, h(\mathcal{H})}\, A. \]
Proof. (a) Under conditions (C1) and (C2), the Ruelle operator $H_s$ acts on
$\mathcal{F}(V)$ for $\Re(s) > \alpha$ and is compact (even nuclear in the sense of Grothendieck).
Furthermore, for real values of parameter $s$, it has positivity properties
that entail (via theorems of Perron-Frobenius style due to Krasnoselsky)
the existence of dominant spectral objects: there exist a unique dominant
eigenvalue $\lambda(s)$, positive and analytic for $s > \alpha$, a dominant eigenfunction denoted by $\psi_s$,
and a dominant projector $e_s$. Under the normalization condition $e_s[\psi_s] = 1$, these
last two objects are unique too. Then, compactness entails the existence of a
spectral gap between the dominant eigenvalue and the remainder of the spectrum,
which separates the operator $H_s$ and the quasi-inverse $(I - H_s)^{-1}$ into two
parts: the “part” relative to the dominant eigenvalue, and the “part” relative
to the remainder of the spectrum. Under condition (C3), the operator $H_2$ is a
density transformer, so that $\lambda(2) = 1$ and $e_2[f] = \int_I f(t)\,dt$. On the other hand,
condition (C4) implies that the operator $H_s$ has no eigenvalue equal to 1 on the
line $\Re(s) = 2$, $s \ne 2$.

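(b) Here, the Dirichlet series $G(s)$ has a triple pole at $s = 2$, and near $s = 2$,
up to a constant factor that also appears in $F(s)$ and cancels in the ratio of
partial sums,

\[ G(s) \sim \Bigl(\frac{-1}{\lambda'(2)\,(s-2)}\Bigr)^{3} \Bigl(\int_I \Delta H_2[\psi_2](t)\,dt\Bigr) \Bigl(\int_I H_2^{[c]}[\psi_2](t)\,dt\Bigr). \]

Both integrals are easily transformed. The first one,

\[ \int_I \Delta H_2[\psi_2](t)\,dt = \int_I |\log_2 t|\, \psi_2(t)\,dt, \]

equals $-\lambda'(2)/\log 2$ and coincides with $h(\mathcal{H})/(2 \log 2)$, where $h(\mathcal{H})$ is the entropy
of the dynamical system. The second integral deals with the cost $c(h)$ of the LFT,

\[ \int_I H_2^{[c]}[\psi_2](t)\,dt = \sum_{h \in \mathcal{H}} c(h) \int_{h(I)} \psi_2(t)\,dt, \]

and coincides with the average value of cost $c$ when the interval $I$ is endowed
with the invariant measure $\psi_2(t)\,dt$. This average value is denoted by $E_\infty[c]$ and
it is a constant of Khinchin's type.

(c) Here, the Dirichlet series $G(s)$ has a pole of order four at $s = 2$.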

4 Average-Bit Complexity of the Algorithms 

We now come back to the analysis of the five algorithms, and we study succes- 
sively the fast algorithms, then the slow ones. 

4.1. The fast algorithms. We study here three algorithms: the Classical Algorithm
($\mathcal{G}$), the Centered Algorithm ($\mathcal{K}$) and the Binary Algorithm ($\mathcal{B}$). We begin
with the first two, which constitute “easy cases”.

The Classical Algorithm and the Centered Algorithm. We first consider the sets
$\mathcal{G}$, $\mathcal{K}$ relative to the Classical Algorithm or the Centered Algorithm. There exists,
in both cases, an open disk $V$ whose diameter strictly contains the basic interval
$I$, such that each LFT $h$ maps $V$ strictly inside itself. A convenient functional
space is then the set $A_\infty(V)$ formed with functions $f$ that are analytic on $V$
and continuous on $\bar V$. Endowed with the sup-norm, this set is a Banach space
and each Ruelle operator $R_{s,h}$ acts on this set and is compact. The series of
(C2) is convergent for $\Re(s) > 1$. Moreover, at $s = 2$, the Ruelle operators are
density transformers. The (normalized) invariant functions $\psi_2$ are explicit in the
classical case and in the centered case, respectively:

\[ \psi_2(x) = \frac{1}{\log 2}\, \frac{1}{1 + x}, \qquad \psi_2(x) = \frac{1}{\log \phi} \Bigl[ \frac{1}{\phi + x} + \frac{1}{\phi^2 - x} \Bigr] \quad \text{with } \phi = \frac{1 + \sqrt{5}}{2}. \]

Now, we can apply Theorem 1, version (b). Here, since the invariant function $\psi_2$ is
explicit, the same is true for the entropy, which equals $\pi^2/(6 \log 2)$ in the classical
case and $\pi^2/(6 \log \phi)$ in the centered case. Moreover, the average value
$E_\infty[c]$ is also easily computed from the explicit forms of cost $c$, of invariant function
$\psi_2$ and distribution function $F_2$.
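The density-transformer property can be checked numerically on the two explicit densities: applying $H_2$ should reproduce the same function. In the sketch below the sums are truncated at a finite number of terms, and the digit set used for the centered case ($m \ge 2$ with sign $+1$, $m \ge 3$ with sign $-1$, acting on $[0, 1/2]$) is an assumption of this illustration, since the table of Figure 1 is not reproduced here. All names are hypothetical.

```python
from math import log, sqrt

PHI = (1 + sqrt(5)) / 2

def psi_classical(x):
    return 1 / (log(2) * (1 + x))

def psi_centered(x):
    return (1 / (PHI + x) + 1 / (PHI ** 2 - x)) / log(PHI)

def H2_classical(f, x, terms=20000):
    """Truncated H_2[f](x) = sum over m >= 1 of f(1/(m+x)) / (m+x)^2."""
    return sum(f(1 / (m + x)) / (m + x) ** 2 for m in range(1, terms + 1))

def H2_centered(f, x, terms=20000):
    """Truncated H_2 for the centered algorithm on [0, 1/2] (assumed digit set)."""
    plus = sum(f(1 / (m + x)) / (m + x) ** 2 for m in range(2, terms + 1))
    minus = sum(f(1 / (m - x)) / (m - x) ** 2 for m in range(3, terms + 1))
    return plus + minus
```

In both cases the sum telescopes, so the truncated value differs from the density by roughly the reciprocal of the number of terms.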

The Binary Algorithm. The set $\mathcal{B}$ of LFT's relative to the Binary Algorithm is
more difficult to deal with (see [23]). First, it is not possible to find an open
disk whose diameter contains the basic interval $I := [0,1]$ and on which all the
LFT's are analytic. The reason is that the sequence of poles of the LFT's is of the
form $x = -m/2^b$ and has an accumulation point at $x = 0$. We choose for $V$ an
open disk of diameter $]0, a[$ with $1 < a < 2$, and a convenient functional space
is then the Hardy space of order two relative to $V$. It is denoted by $\mathcal{H}^2(V)$ and
is formed with all functions $f$ analytic inside $V$ and such that $|f|^2$ is integrable
along the frontier of $V$. Each Ruelle operator $R_{s,h}$ acts on this set and is compact.
The series of (C2) is convergent for $\Re(s) > 3/2$.

Now, we can apply Theorem 1, version (b), as previously. However, the dynamical
system with which the algorithm is associated is now more complex, since it is
a random dynamical system. The reason is that the pseudo-division is related
to the dyadic valuation, so that the binary continued fraction expansion is only
defined for rational numbers. However, one can define a random binary continued
fraction for real numbers by choosing at random the dyadic valuation $k$ of a
real number, according to the law $\Pr[k = d] = 2^{-d}$ (for $d \ge 1$) that extends the
natural law on even integers. In this manner, we choose the LFT of determinant
$2^k$ with probability $2^{-k}$, and, for LFT's of determinant $2^k$, the quantity
$D[h](x)^{-2}\, f \circ h(x)\,dx = 2^{-k} |h'(x)|\, f \circ h(x)\,dx$
represents exactly a random change of variables. Then, the Ruelle operator can
be viewed as the transfer operator relative to this random dynamical system and
for $s = 2$ it is a (random) density transformer. Furthermore, even if the invariant
eigenfunction $\psi_2$ and the distribution function $F_2$ are no longer explicit, the entropy
and the average value $E_\infty[c]$ can be expressed as functions of $F_2$ and $\psi_2$.

Theorem 2. The average bit-complexities of the Classical Algorithm ($\mathcal{G}$), the
Centered Algorithm ($\mathcal{K}$) and the Binary Algorithm ($\mathcal{B}$) on the set of valid inputs
of denominator less than $N$ are of asymptotic logarithmic-squared order. They
satisfy, for $\mathcal{H} \in \{\mathcal{G}, \mathcal{K}, \mathcal{B}\}$,

\[ \tilde B_N(\mathcal{H}) \sim B_N(\mathcal{H}) \sim A(\mathcal{H}) \log_2^2 N \quad \text{with} \quad A(\mathcal{H}) = \frac{\log 2}{h(\mathcal{H})}\, E_\infty[c]. \]

Here $h(\mathcal{H})$ is the entropy of the dynamical system relative to the algorithm and
$E_\infty[c]$ denotes the average value of cost $c$ related to the LFT $h$ when the interval
$I$ is endowed with the invariant measure. In the classical case ($\mathcal{G}$), the cost $c(h)$
equals $\ell(m) + 2$ where $\ell(m)$ is the number of bits of digit $m$, and

\[ A(\mathcal{G}) = \frac{6 \log^2 2}{\pi^2} \Bigl[ 2 + \frac{1}{\log 2} \log \prod_{k=0}^{\infty} \Bigl(1 + \frac{1}{2^k}\Bigr) \Bigr]; \]



in the centered case ($\mathcal{K}$), the cost $c(h)$ equals $\ell(m) + 2 + (1 - \epsilon)/2$ where $\ell(m)$
is the number of bits of digit $m$, and $\epsilon = \pm 1$ the sign used, so that



A{JC) 



61og(^log2 . log2 1 fr (2'1 :l 1)^1+^1 
TT^ ^ log</> log</> (2^ — i)<A^ — 2 ^ 



In the binary case ($\mathcal{B}$), the cost $c(h)$ equals $b(m) + 2$ where $b(m)$ is the number
of ones in the binary expansion of digit $m$, and



A{B) = ^^ [1 
Tr^ip2[f) 



E 

m odd>l 



2 ^( 



m 



where $\psi_2(t)\,dt$ is the invariant measure, $F_2$ its distribution function, and, as
before, $\ell(m)$ is the number of bits of integer $m$.



4.2. The slow algorithms. We study now the Subtractive Algorithm ($\mathcal{T}$) and the
By-Excess Algorithm ($\mathcal{E}$). In these cases, the analysis requires a special twist
that takes its inspiration from the study of intermittency phenomena in physical
systems that was introduced by Bowen and is nicely exposed in a paper of
Prellberg and Slawny. Even if the Subtractive Algorithm can be directly
analyzed, it is clearer to describe both analyses inside the same framework.
First, we remark that each (internal) step of the Subtractive Algorithm may use
two LFT's, $p(x) := x/(1 + x)$ and $q(x) := 1/(1 + x)$, depending on whether the
subtraction is followed or not by an exchange.

In both cases ($\mathcal{T}$, $\mathcal{E}$), the set of LFTs $\mathcal{H}$ does not fulfill conditions Q($\mathcal{H}$) since it
contains one “bad” LFT which possesses an indifferent point, i.e., a fixed point
where the absolute value of the derivative equals 1: this is $p : x \mapsto 1/(2 - x)$ for
$\mathcal{E}$ and $p : x \mapsto x/(1 + x)$ for $\mathcal{T}$. In this case, it is not possible to find an open
disk $V$ that contains the basic interval $I$ and such that the LFT $p$ maps $\bar V$ inside
$V$. However, the remainder of the set, i.e., the set $\mathcal{H} \setminus \{p\}$, is well-behaved, so
that we adapt the method of inducing that originates from dynamical systems
theory. The main idea is to consider a sequence of $p$ to be followed by a good
LFT $q$ that belongs to $\mathcal{H} \setminus \{p\}$. This is always possible since the final set $\mathcal{F}$ does not
contain $p$. In that way, instead of the set $\mathcal{H}$, we use the set $\mathcal{M} := p^* [\mathcal{H} \setminus \{p\}]$ that now
fulfills conditions Q($\mathcal{M}$).

A sequence of iterations that uses an element $p^k q$ of $\mathcal{M}$ with $k \ge 0$ and $q \ne p$
deals with successive integers $u_t$ (for $1 \le t \le k$) that are slowly decreasing,
so that the global cost of the sequence of iterations is well-approximated by
$k \log_2 u_0$. Finally, one can apply Theorem 1, version (c), to the set $\mathcal{M}$, and the
integral relative to the Ruelle operator of costs, $\int_I M_s^{[c]}[\psi_2](t)\,dt$, is a divergent
series at $s = 2$ with residue $A$ equal to $1/(2 \log 2)$ in the $\mathcal{E}$-case and $1/\log 2$ in the
$\mathcal{T}$-case.

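Theorem 3. The average bit-complexities of the By-Excess Algorithm ($\mathcal{E}$) and
the Subtractive Algorithm ($\mathcal{T}$) on the set of valid inputs of denominator less
than $N$ are of asymptotic log-cubed order. They all satisfy

\[ \tilde B_N(\mathcal{H}) \sim B_N(\mathcal{H}) \sim A(\mathcal{H}) \log_2^3 N \quad \text{with} \quad A(\mathcal{T}) = \frac{2 \log^2 2}{\pi^2}, \quad A(\mathcal{E}) = \frac{\log^2 2}{\pi^2}. \]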
Abstract. A considerable number of asymptotic distributions arising in
random combinatorics and analysis of algorithms are of the exponential-
quadratic type ($e^{-x^2}$), that is, Gaussian. We exhibit here a new class
of “universal” phenomena that are of the exponential-cubic type ($e^{x^3}$),
corresponding to nonstandard distributions that involve the Airy function.
Such Airy phenomena are expected to be found in a number of
applications, when confluences of critical points and singularities occur.
About a dozen classes of planar maps are treated in this way, leading to
the occurrence of a common Airy distribution that describes the sizes of
cores and of largest (multi)connected components. Consequences include
the analysis and fine optimization of random generation algorithms for
multiply connected planar graphs.



Maps are planar graphs presented together with an embedding in the plane,
and as such, they model the topology of many geometric arrangements in the
plane and in low dimensions (e.g., 3-dimensional convex polyhedra). This paper
concerns itself with the statistical properties of random maps, i.e., the question
of what such a random map typically looks like. We focus here on connectivity
issues, with the specific goal of finely characterizing the size of the highly
connected “core” of a random map.

The bases of an enumerative theory of maps were laid down by Tutte
in the 1960's, in an attempt to attack the four-colour conjecture. The present
paper builds upon Tutte's results and upon the detailed yet partial analyses of
largest components given by Bender, Richmond, Wormald, and Gao. We
establish the common occurrence of a new probability distribution, the “map-
Airy distribution”, that precisely quantifies the sizes of cores in about a dozen
varieties of maps, including general maps, triangulations, 2-connected maps, etc.
As a corollary, we are able to improve on the complexity of the best known
random samplers for multiply connected planar graphs and convex polyhedra.

The analysis that we introduce is largely based on a method of “coalescing
saddle points” that was perfected in the 1950's by applied mathematicians
and has found scattered applications in statistical physics and
the study of phase transitions. However, this method does not appear to
have been employed so far in the field of random combinatorics. We claim some
generality for the approach proposed here on at least two counts. First, a number
of enumerative problems are known to be of the “Lagrangean type”, being

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 388-^1 2000. 
related to the Lagrange inversion theorem and its associated combinatorics. The
classical saddle point method is then instrumental in providing asymptotics of
simpler problems. However, confluence of saddle points is a stumbling block of
the basic method. As we show here, planar maps are precisely instances of this
special situation. Next, the method extends to the analysis of a new composition
scheme. Indeed, it is known, in the realm of analytic combinatorics, that
asymptotic properties of random structures are closely related to singular exponents
of counting generating functions. For “most” recursive objects the exponent is $\frac{1}{2}$
and the probabilistic phenomena are described by classical laws, like Gaussian,
exponential, or Poisson. Methods of the paper permit us to quantify distributions
associated with singular exponents $\frac{3}{2}$ present in maps and unrooted trees
and leading to Airy laws.

Very roughly, the classical saddle point method gives rise to probabilistic
and asymptotic phenomena that are in the scale of $n^{1/2}$, and the analytic
approximations are in the form of an “exponential-quadratic” ($e^{-x^2}$) corresponding
to Gaussian laws. The coalescent saddle-point method presented here gives
rise to phenomena in the scale of $n^{1/3}$ with analytic approximations of the
“exponential-cubic type” ($e^{x^3}$), which, as we shall explain, is conducive to Airy
laws. The Airy phenomena that we uncover in random combinatorics should
thus be expected to be of a fair degree of universality. To support this claim,
here are scattered occurrences of what we recognize as Airy phenomena in the
perspective of this paper: the emergence of first cycles and of the giant component
in the Erdős-Rényi graph model, the enumeration of random forests
of unrooted trees, clustering formation in the construction of linear probing
hash tables, the area under excursions and the cumulative storage cost of
dynamically varying stacks, the area of certain polyominoes [7], path length
in combinatorial tree models, and (we conjecture) the threshold phenomena
involved in the celebrated random 2-SAT problem. We propose to elaborate
on these connections in future papers.

Plan of the paper. Basics of maps are introduced in Section 1, where the Airy
distribution is presented. The enumerative theory can be developed along two
parallel lines, one Lagrangean, the other based on singularity analysis. We first
approach the analysis of core size via the Lagrangean framework and variations
on the saddle point method: a fine analysis of the geometry of associated complex
curves is shown to open access to the size of the core, with the Airy distribution
arising from double or “nearby” saddles (Section 2); a refined analysis based
on the method of coalescent saddle points then enables us to quantify the
distribution of core size over a wide range with precise large deviation estimates
(Section 3). The method applies to about a dozen types of planar maps; it
provides a precise quantification of largest components, with consequences on
the random generation of highly connected planar graphs (Section 4). Finally,
we show that the very same Airy law is bound to occur in any instance of a
general composition scheme of analytic combinatorics (Section 5).
1 Basics of Maps

A map is a planar graph given together with an embedding in the plane, considered
up to continuous deformations. Following Tutte, we consider rooted maps,
that is, maps with an oriented edge called the root; this simplifies the analysis
without essentially affecting statistical properties.
Generically, we take $\mathcal{M}$ and $\mathcal{C}$ to be two classes of maps, with $\mathcal{M}_n$, $\mathcal{C}_n$ the subsets
of elements of size $n$ (typically elements with $n + 1$ edges). Here, $\mathcal{C}$ is always
a subset of $\mathcal{M}$ that satisfies additional properties (e.g., higher connectivity). The
elements of $\mathcal{M}$ are then called the “basic maps” and the elements of $\mathcal{C}$ are called
the “core-maps”. We define the core-size of a map $m \in \mathcal{M}$ as the size of the
largest $\mathcal{C}$-component of $m$ that contains the root of $m$. As a pilot example, we
shall specialize the basic maps $\mathcal{M}_n$ to be the class of nonseparable maps (i.e.,
2-connected loopless maps) with $n + 1$ edges and $\mathcal{C}_k$ to be the set of 3-connected
maps with $k + 1$ edges.

Our major objective is to characterize the probabilistic properties of core-size
of a random element of $\mathcal{M}_n$, that is, of a random map of size $n$, when all elements
are taken equally likely. Core-size then becomes a random variable $X_n$ defined
on $\mathcal{M}_n$. In essence, the pilot example thus deals with 3-connectivity in random
2-connected maps. The paradigm that we illustrate by a particular example is
in fact of considerable generality, as can be seen below.

The physics of maps. From earlier works, it is known that a random
map of $\mathcal{M}_n$ has with high probability a core that is either “small” (roughly of size
$k = O(1)$) or “large” (being $\Theta(n)$). The probability distribution $\Pr(X_n = k)$ thus
has two distinct modes. The small region (say $k = o(n)$) has been well quantified
by previous authors: a fraction $p_s$ of the probability mass is
concentrated there. The large region is also known from these authors to have
probability mass $p_\ell = 1 - p_s$ concentrated around $\alpha_0 n$ for an explicit constant
$\alpha_0$, but this region has been much less explored as it poses specific analytical
difficulties. Our results precisely characterize what happens in terms of an Airy
distribution.

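The Airy function $\mathrm{Ai}(z)$, as introduced by the Royal Astronomer Sir George 
Biddell Airy, is a solution of the equation $y'' - zy = 0$ that can be defined by a 
variety of integral or power series representations including [23]: 

\[ \mathrm{Ai}(z) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{i(zt+t^3/3)}\,dt = \frac{1}{\pi\,3^{2/3}}\sum_{n\ge 0}\left(3^{1/3}z\right)^n \frac{\Gamma((n+1)/3)}{n!}\,\sin\frac{2(n+1)\pi}{3}. \tag{1} \]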

Equipped with this definition, we present the main character of the paper. 

Definition 1. The (standard) “map-Airy” distribution is the probability distri- 
bution whose density is 

\[ \mathcal{A}(x) = 2\exp\left(-\tfrac{2}{3}x^3\right)\left(x\,\mathrm{Ai}(x^2) - \mathrm{Ai}'(x^2)\right). \]

The “map-Airy” distribution of parameter $c$ is defined by its density, $c\,\mathcal{A}(cx)$. 
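
As a numerical illustration, here is a minimal Python sketch that sums the power series of (1) and combines it into the density of Definition 1. The function names (`airy_ai`, `airy_ai_prime`, `map_airy_density`) are illustrative choices, the derivative is obtained by differentiating the series term by term, and the truncation at 80 terms is an assumption that is adequate only for moderate arguments (roughly $|z| \le 3$); for larger arguments the alternating series suffers from cancellation.

```python
import math

CBRT3 = 3.0 ** (1.0 / 3.0)            # 3^(1/3)
NORM = math.pi * 3.0 ** (2.0 / 3.0)   # pi * 3^(2/3)


def _coeff(n):
    """Gamma((n+1)/3) * sin(2(n+1)pi/3) / n!, the n-th series coefficient."""
    return (math.gamma((n + 1) / 3.0)
            * math.sin(2.0 * (n + 1) * math.pi / 3.0)
            / math.factorial(n))


def airy_ai(z, terms=80):
    """Ai(z) from the truncated power series (moderate |z| only)."""
    return sum(_coeff(n) * (CBRT3 * z) ** n for n in range(terms)) / NORM


def airy_ai_prime(z, terms=80):
    """Ai'(z), by termwise differentiation of the same series."""
    return sum(_coeff(n) * n * CBRT3 * (CBRT3 * z) ** (n - 1)
               for n in range(1, terms)) / NORM


def map_airy_density(x):
    """Standard map-Airy density A(x) = 2 exp(-2x^3/3) (x Ai(x^2) - Ai'(x^2))."""
    return 2.0 * math.exp(-2.0 * x ** 3 / 3.0) * (
        x * airy_ai(x * x) - airy_ai_prime(x * x))


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(x, map_airy_density(x))
```

At $x = 0$ the density reduces to $-2\,\mathrm{Ai}'(0)$, which gives a quick sanity check of the series against tabulated values of the Airy function.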
Fig. 1. (i) The map-Airy distribution. (ii) Observed frequencies of core-sizes $k \in$ 

[20, 1000] in 100,000 random maps of size 2000, against predictions of ThmsEl S 



Note the nonobvious fact that the map-Airy distribution is a probability dis- 
tribution, i.e., $\int_{-\infty}^{+\infty} \mathcal{A}(x)\,dx = 1$, which can be checked by Mellin transform 
techniques. An unusual feature is the fact that the tails are extremely asymmetric: 
$\mathcal{A}(x) = O(|x|^{-5/2})$ as $x \to -\infty$, and $\mathcal{A}(x) = O\left(x^{1/2}\exp(-\tfrac{4}{3}x^3)\right)$ as $x \to +\infty$. 

We shall find that the size of the core (conditioned upon the large region) is 
described asymptotically by an Airy law of this type; see Figure 1. 

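The combinatorics of maps. Let $M_n$ and $C_k$ be the cardinalities of $\mathcal{M}_n$ and 
$\mathcal{C}_k$. The generating functions of $\mathcal{M}$ and $\mathcal{C}$ are respectively defined by 

\[ M(z) := \sum_{n\ge 1} M_n z^n, \quad\text{and}\quad C(z) := \sum_{k\ge 1} C_k z^k. \]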

(i) Root-face decomposition. As shown by Tutte, there results from a root-face 
decomposition and from the quadratic method [13, Sec. 2.9] that the generating 
function $M(z)$ is Lagrangean, which means that it can be parametrized by a 
system of the form 

\[ M(z) = \psi(L(z)) \quad\text{where}\quad L(z) = z\,\phi(L(z)), \tag{2} \]

for two power series $\psi$, $\phi$, with $L$ being determined implicitly by $\phi$. For nonsep- 
arable maps, we have $\phi(y) = (1+y)^3$, $\psi(y) = y(1-y)$. There results from the 
form (2) and from the Lagrange inversion theorem [13] an explicit form for the 
coefficients of $M(z)$, namely, 

\[ M_n \equiv [z^n]M(z) = \frac{1}{n}\,[y^{n-1}]\,\psi'(y)\,\phi(y)^n, \tag{3} \]

where $[z^n]f(z)$ denotes the coefficient of $z^n$ in the series expansion of $f(z)$. For 
nonseparable maps, this instantiates to $M_n = [z^n]M(z) = 4(3n)!/(n!\,(2n+2)!)$. 
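
The Lagrange inversion step can be checked mechanically. The sketch below (function names are illustrative, not from the paper) extracts the coefficient $\frac{1}{n}[y^{n-1}](1-2y)(1+y)^{3n}$ with binomial coefficients and compares it with the closed form $4(3n)!/(n!\,(2n+2)!)$ for nonseparable maps.

```python
from math import comb, factorial


def nonseparable_count_lagrange(n):
    """(1/n) [y^(n-1)] psi'(y) phi(y)^n with psi'(y) = 1 - 2y, phi(y) = (1+y)^3."""
    coeff = comb(3 * n, n - 1)
    if n >= 2:
        # contribution of the -2y factor: shift the extracted index by one
        coeff -= 2 * comb(3 * n, n - 2)
    return coeff // n


def nonseparable_count_closed(n):
    """Closed form 4 (3n)! / (n! (2n+2)!)."""
    return 4 * factorial(3 * n) // (factorial(n) * factorial(2 * n + 2))


if __name__ == "__main__":
    print([nonseparable_count_lagrange(n) for n in range(1, 8)])
```

The first values, 1, 2, 6, 22, 91, 408, 1938, count rooted nonseparable maps with 2, 3, ..., 8 edges.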

(ii) Substitution decomposition. As shown again by Tutte, maps additionally 
satisfy relations of the “substitution type”. One has 

\[ M(z) = \left(z + \frac{2M(z)^2}{1+M(z)}\right) + C(M(z)), \]

meaning that each map (left part) either has no core (right part, the 
first term) or is formed of a nondegenerate core in which maps are substituted 
(right part, the second term). This equation effectively gives access to the exact 
enumeration of objects of type $\mathcal{C}$ that are more “complex”, i.e., more highly 
connected than the initial maps of $\mathcal{M}$. 
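
To make the phrase “gives access to the exact enumeration” concrete, here is a hedged Python sketch that inverts the substitution relation with truncated power series. It assumes that the relation reads $M(z) = z + 2M(z)^2/(1+M(z)) + C(M(z))$ and that $M_n = 4(3n)!/(n!\,(2n+2)!)$; the helper names are illustrative. Since $M(z)^k = z^k + \cdots$, the unknowns $C_k$ can be solved for one at a time.

```python
from math import factorial


def nonseparable_series(order):
    """Coefficients [M_0, ..., M_order] of M(z), with M_0 = 0."""
    return [0] + [4 * factorial(3 * n) // (factorial(n) * factorial(2 * n + 2))
                  for n in range(1, order + 1)]


def mul_trunc(a, b, order):
    """Product of two power series, truncated after z^order."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > order:
                    break
                out[i + j] += ai * bj
    return out


def core_counts(order):
    """Coefficients [C_0, ..., C_order] solving M = z + 2M^2/(1+M) + C(M)."""
    m = nonseparable_series(order)
    powers = [None, m]                      # powers[j] = M(z)^j, truncated
    for _ in range(2, order + 1):
        powers.append(mul_trunc(powers[-1], m, order))
    # rest(z) = M - z - 2 * sum_{j>=2} (-1)^j M^j, i.e. M - z - 2M^2/(1+M)
    rest = list(m)
    rest[1] -= 1
    for j in range(2, order + 1):
        sign = 1 if j % 2 == 0 else -1
        for i in range(order + 1):
            rest[i] -= 2 * sign * powers[j][i]
    # triangular solve of rest(z) = sum_k C_k M(z)^k
    c = [0] * (order + 1)
    for k in range(1, order + 1):
        c[k] = rest[k]
        for i in range(k, order + 1):
            rest[i] -= c[k] * powers[k][i]
    return c


if __name__ == "__main__":
    print(core_counts(10)[1:])
```

Under these assumptions the first nonzero value is $C_5 = 1$, the rooted tetrahedron with 6 edges, followed by $C_6 = 0$ and $C_7 = 4$ (the rootings of the square pyramid).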

Our interest lies in the probability $\Pr(X_n = k)$ that a map of $\mathcal{M}_n$ has 
a core with $k+1$ edges. Let $\mathcal{M}_{n,k}$ be the set of maps with this property; we 
define the bivariate generating function $M(z,u) = \sum_{n,k} M_{n,k}\,u^k z^n$ with $M_{n,k} = 
\mathrm{card}(\mathcal{M}_{n,k})$. Tutte proved the following refinement: $M(z,u) = C(uM(z))$. This 
determines the probability distribution of the core-size: 

\[ \Pr(X_n = k) = \frac{C_k\,[z^n]M(z)^k}{M_n} = \frac{k}{n}\,\frac{C_k}{M_n}\,[y^{n-1}]\,\psi'(y)\,\psi(y)^{k-1}\,\phi(y)^n, \tag{4} \]

where the second equality results from Lagrange inversion. 

All the generating functions involved are algebraic functions, leading to com- 
plicated alternating binomial sums expressing $\Pr(X_n = k)$. The exponential 
cancellations involved are however not tractable in this elementary way, and 
complex asymptotic methods must be resorted to. 

The asymptotics of maps. There are here two sides to the coin: one evoked 
now and explored further in Section El relies on singularity analysis 0, a method 
that establishes a general correspondence between the expansion of a generating 
function at a singularity and the asymptotic form of its coefficients; the other 
discussed in the next two sections makes use of the power forms provided by the 
Lagrange inversion theorem that can be exploited asymptotically by the saddle 
point method. 

An implicitly defined function like $L(z)$ in (2) has a singularity of the square- 
root type, $L(z) = \tau - c(1 - z/\rho)^{1/2} + O(1 - z/\rho)$, where the singularity $\rho$ and the 
singular value $\tau$ are determined by the equations $\tau\phi'(\tau) - \phi(\tau) = 0$, $\rho = \tau/\phi(\tau)$. 
This expansion yields the singular expansion of the generating function of maps, 

\[ M(z) = \psi(\tau) - a(1 - z/\rho) + b(1 - z/\rho)^{3/2} + O((1 - z/\rho)^2) \tag{5} \]

(in all known map-related cases, one has $\psi'(\tau) = 0$, which induces the singu- 
lar exponent of 3/2). According to singularity analysis (or the Darboux-Pólya 
method), this last expansion entails 

\[ M_n \sim \frac{3b}{4\sqrt{\pi}}\,\rho^{-n} n^{-5/2}; \qquad \text{for nonseparable maps, } \tau = \tfrac{1}{2},\ \rho = \tfrac{4}{27},\ M_n \sim \frac{1}{2}\sqrt{\frac{3}{\pi}}\left(\frac{27}{4}\right)^{n} n^{-5/2}. \tag{6} \]
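
The singular data of the pilot example can be reproduced in a few lines. The sketch below (names are illustrative) solves $\tau\phi'(\tau) = \phi(\tau)$ exactly for $\phi(y) = (1+y)^3$, and compares the exact count $4(3n)!/(n!\,(2n+2)!)$ with the estimate $\frac{1}{2}\sqrt{3/\pi}\,(27/4)^n n^{-5/2}$ that Stirling's formula gives for this closed form; logarithms are used to avoid overflow.

```python
import math
from fractions import Fraction


def phi(y):
    return (1 + y) ** 3


def phi_prime(y):
    return 3 * (1 + y) ** 2


def psi(y):
    return y * (1 - y)


# tau * phi'(tau) = phi(tau) reduces to 3 * tau = 1 + tau for this phi
TAU = Fraction(1, 2)
RHO = TAU / phi(TAU)          # radius of convergence, rho = tau / phi(tau)


def log_exact_count(n):
    """log of M_n = 4 (3n)! / (n! (2n+2)!)."""
    return (math.log(4) + math.lgamma(3 * n + 1)
            - math.lgamma(n + 1) - math.lgamma(2 * n + 3))


def log_asymptotic_count(n):
    """log of (1/2) sqrt(3/pi) rho^(-n) n^(-5/2)."""
    return (math.log(0.5) + 0.5 * math.log(3 / math.pi)
            - n * math.log(float(RHO)) - 2.5 * math.log(n))


def ratio(n):
    """Exact count divided by its asymptotic estimate; tends to 1."""
    return math.exp(log_exact_count(n) - log_asymptotic_count(n))


if __name__ == "__main__":
    print(TAU, RHO, psi(TAU))
    for n in (10, 100, 1000, 10000):
        print(n, ratio(n))
```

The value $\psi(\tau) = 1/4$ printed here is also the location of the singularity of $C(z)$, which is what makes the composition critical.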



Finally, this approach also yields by direct inversion the asymptotic number of 
core maps: $C(z)$ has a singularity at $\psi(\tau)$ and singular exponent 3/2. We have 

\[ C(z) = c_0 - a'(1 - z/\psi(\tau)) + b'(1 - z/\psi(\tau))^{3/2} + O((1 - z/\psi(\tau))^2), \tag{7} \]

\[ C_k = [z^k]C(z) \sim \frac{3b'}{4\sqrt{\pi}}\,\psi(\tau)^{-k} k^{-5/2}, \qquad \text{(3-conn.)}\quad C_k \sim \frac{8}{243\sqrt{\pi}}\,4^k k^{-5/2}. \tag{8} \]



The foregoing discussion is then conveniently summarized by a statement that 
constitutes the starting point of our analysis. 

Proposition 1. The distribution of the size of the core in nonseparable maps is 
characterized by Eq. (4), where the core map counts $C_k$ are determined asymptot- 
ically by (8). The basic maps in $\mathcal{M}_n$ are enumerated exactly by (3) and asymp- 
totically by (6). 

2 Two Saddles 

The probability distribution of core-size in maps is determined by Proposition 1, 
especially by Equation (4). What is needed is a way to estimate $[z^n]M(z)^k$. The 
approach starts from the contour integral representation deriving from Cauchy’s 
coefficient formula, 

\[ [z^n]M^k(z) = \frac{k}{n}\,\frac{1}{2i\pi}\int_{\Gamma} G(z)\,\frac{\psi(z)^k\,\phi(z)^n}{z^n}\,dz, \tag{9} \]

where $\Gamma$ is a contour encircling the origin anticlockwise and $G(z) = \psi'(z)/\psi(z) = 
(1-2z)/(z(1-z))$. 

In simpler cases, integrals over complex contours involving large powers are 
amenable to the basic saddle point method. The idea consists in deforming the 
contour T in the complex plane, this, in order to have it cross a saddle point of the 
integrand ( j.e., a zero of the derivative) and to take advantage of concentration of 
the integral near the saddle point. Then local expansions are of the “exponential 
quadratic” type and the (real-variable) Laplace method permits one to estimate 
the integral asymptotically jS]. 

For the problem at hand, there are two saddle points, given by the equation 
$\frac{\partial}{\partial z}\left(k\ln\psi + n\ln(\phi/z)\right) = 0$: 

\[ z_+(n,k) = \frac{1}{2} \quad\text{and}\quad z_-(n,k) = \frac{n-k}{n+k}. \]
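
The saddle point equation is easy to verify with exact rational arithmetic. In the sketch below (the function name `saddle_equation` is an illustrative choice), the derivative of $k\ln\psi(z) + n\ln(\phi(z)/z)$ is written out for $\psi(z) = z(1-z)$ and $\phi(z) = (1+z)^3$, and evaluated at $z = 1/2$ and $z = (n-k)/(n+k)$.

```python
from fractions import Fraction


def saddle_equation(z, n, k):
    """d/dz [k ln(z(1-z)) + n ln((1+z)^3 / z)], evaluated exactly."""
    return k * (1 - 2 * z) / (z * (1 - z)) + n * (3 / (1 + z) - 1 / z)


def saddle_points(n, k):
    """The two saddle points z_+ = 1/2 and z_- = (n-k)/(n+k)."""
    return Fraction(1, 2), Fraction(n - k, n + k)


if __name__ == "__main__":
    n, k = 2000, 500
    z_plus, z_minus = saddle_points(n, k)
    print(z_plus, z_minus, saddle_equation(z_plus, n, k), saddle_equation(z_minus, n, k))
    # the two saddle points merge exactly when n = 3k
    print(saddle_points(1500, 500))
```

Factoring the derivative as $(1-2z)\left(\frac{k}{z(1-z)} - \frac{n}{z(1+z)}\right)$ shows directly where the two roots come from, and why they coincide when $n = 3k$.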

The basic saddle point method applies when these two points are distinct, that 
is, as long as $k/n$ is “far away” from 1/3. This corresponds to the situation already 
well-known from the works of mm . The “interesting” region is however when 
$k = n/3$ and when $k$ is close to $n/3$ in the scale of $n^{2/3}$. In that case, the basic 
version of the saddle point method is no longer applicable. This is precisely where 
we fit in: we prove that a detailed examination of the analytic geometry of the 
saddle points in conjunction with suitable integration contours “captures” the 
major contributions and leads to a precise quantification of core-size in random 
maps. 

Distinct saddles When k is far enough from n/3, one of the saddle points 
is nearer to the origin and predominates. In that case, the basic method ap- 
plies using a contour that is a circle centered at the origin, passing through the 
dominant saddle point. This corresponds to the already known results of I2CI1 
supplemented by HB|. 

Theorem 1 (Tails and distinct saddles). Let $\lambda(n)$ be an arbitrary func- 
tion with $\lambda(n) \to +\infty$ and $\lambda(n) = o(n^{1/3})$. Then, the probability distribution of 
the core of a random element of $\mathcal{M}_n$ satisfies 

\[ \Pr(X_n = k) \sim \frac{32}{243\sqrt{\pi}}\cdot\frac{n^{5/2}}{k^{3/2}\,(n-3k)^{5/2}}, \quad\text{uniformly for } \lambda(n) < k < \frac{n}{3} - n^{2/3}\lambda(n), \]

\[ \Pr(X_n = k) = O\left(\exp\left(-n(k/n - 1/3)^3\right)\right), \quad\text{uniformly for } k > \frac{n}{3} + n^{2/3}\lambda(n). \]

Proof (Sketch). The left tail ($n > 3k$) corresponds to the saddle point $z_+ = 1/2$ that is 
dominant (i.e., nearer to the origin and providing the major asymptotic contribution). 
The right tail ($n < 3k$) has $z_- = (n-k)/(n+k)$ dominating. In each case, the basic 
saddle point method applies. 

A double saddle. Here we attack directly the analysis of the “center” of the 
distribution, that is, the case where $n = 3k$ exactly. Then, the saddle points 
become equal: $z_- = z_+$. This case serves to introduce with minimal apparatus 
the enhancements that need to be brought to the basic saddle point method. 
Observe that the complete confluence of the saddle points precludes the use of 
“exponential-quadratic” approximations and the problem becomes of an “expo- 
nential cubic” type. (See also [3] for a partial discussion of this case based on a 
method of Van der Corput.) 

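Theorem 2 (Central part and a double saddle). The probability distribu- 
tion of the core of a random element of $\mathcal{M}_n$ satisfies, when $n = 3k$, 

\[ \Pr(X_n = k) = \frac{4\cdot 3^{5/6}\,\Gamma(2/3)}{81\,\pi}\,k^{-2/3}\left(1 + O\left((\ln k)^4\,k^{-1/3}\right)\right), \qquad \frac{4\cdot 3^{5/6}\,\Gamma(2/3)}{81\,\pi} \doteq 0.0531. \]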

Proof. When $n = 3k$, equation (9) becomes 

\[ [z^{3k}]M^k(z) = \frac{1}{3}\,\frac{1}{2i\pi}\int_{\Gamma} G(z)\,P(z)^k\,dz, \tag{10} \]

where $P(z) = \psi\phi^3/z^3 = (1-z)(1+z)^9/z^2$ and the “kernel” $\ln(P)$ (together with 
$P$, $P^k$) now has a double saddle point at $\tau = z_- = z_+ = 1/2$, sometimes called a 
“monkey saddle”, viz., a saddle with places for two legs and a tail. The idea consists 
in choosing a contour that is no longer a circle centered at the origin but, rather, 
approaches the real axis at an angle. Specifically, the integration path $\Gamma$ consists of 
the following: the part $\Gamma_0$ of a circle centered at 0 from which a small arc is taken out, 
joining with two (small) segments $\Delta_1$, $\Delta_2$ of length $\delta$ that intersect at 1/2 at an angle of 
$\pm 2\pi/3$. 

We shall adopt a value of $\delta$ satisfying two conflicting requirements, 

\[ n\delta^3 \to \infty, \quad n\delta^4 \to 0, \quad\text{specifically } \delta = (\ln n)\,n^{-1/3}. \tag{11} \]

The kernel $\ln(P)$ has a double saddle point in $\tau$, meaning that its local expansion is of 
the cubic type: 

\[ \ln(P(z)) = \ln(P(\tau)) - d\,(z-\tau)^3 + O((z-\tau)^4). \]

The geometry of the level curves of the kernel shows that the contribution $\mathcal{E}_0$ along 
$\Gamma_0$ to the integral in (10) is bounded by a constant times the value of $|P(z)^k|$ at the 
endpoints of $\Gamma_0$. This contribution then satisfies 

\[ \mathcal{E}_0 = \int_{\Gamma_0} G(z)\exp[k\ln(P(z))]\,dz = O\left(P(\tau)^k \exp(-kd\,\delta^3)\right), \]

which, given the constraints on $\delta$ (condition $n\delta^3 \to \infty$ in (11)), is exponentially small. 

The contribution $\mathcal{E}_{1,2}$ along $\Delta_1 \cup \Delta_2$ to the integral in (10) provides the dominant 
contribution and is estimated next by a local analysis of $P(z)$ for values of $z$ near $\tau$. 
Set $u = z - \tau$. The condition $n\delta^4 \to 0$ in (11) implies that terms of order 4 and 
higher do not matter asymptotically, and a simple calculation, using the fact that 
$G(\tau + u) = -8u + O(u^2)$, yields 

\[ \mathcal{E}_{1,2} = \int_{\Delta_1\cup\Delta_2} G(z)\exp[k\ln(P(z))]\,dz = -8P(\tau)^k \int_{\Delta_1\cup\Delta_2} u\exp\left(-kd\,u^3\right)\left(1 + O(k\delta^4)\right)du. \]

The integral along $\Delta_1 \cup \Delta_2$ can be extended to two full half-lines of angle $\pm 2\pi/3$ 
emanating from the origin, this at the expense of introducing only exponentially small 
error terms (since $n\delta^3 \to \infty$). The rescaling $v = u(kd)^{1/3}\exp(2i\pi/3)$ on $\Delta_1$ and 
$v = u(kd)^{1/3}\exp(-2i\pi/3)$ on $\Delta_2$ then shows that the completed integral equals 

\[ (kd)^{-2/3}\left(e^{4i\pi/3} - e^{-4i\pi/3}\right)\int_0^{+\infty} v\exp(-v^3)\,dv = -(kd)^{-2/3}\,\frac{i}{\sqrt{3}}\,\Gamma(2/3), \]

where the evaluation results from a cubic change of variable. In summary, we have 
found 

\[ [z^{3k}]M^k(z) = \frac{1}{3}\,\frac{1}{2i\pi}\left(\mathcal{E}_0 + \mathcal{E}_{1,2}\right) = \frac{4\,\Gamma(2/3)}{3\sqrt{3}\,\pi}\,\frac{P(\tau)^k}{(kd)^{2/3}}\left(1 + O(k\delta^4)\right), \]

which, given our choice of $\delta$, is equivalent to the statement. 



A similar reasoning proves that the estimate remains valid for $n = 3k + e$ 
with $e = 1$ or $e = 2$, and more generally with any $e$ satisfying $e = O(1)$. 

Nearby saddles. When $k$ is close to $n/3$, we choose in the representation (9) 
an integration contour $\Gamma$ that catches simultaneously the contributions of the 
two saddle points $z_-$ and $z_+$. For this purpose, we adopt a contour that goes 
through the mid-point, $\zeta := (z_- + z_+)/2$, and, like in the previous case, meets 
the positive real line at an angle of $\pm 2\pi/3$. Local estimates of the integrand, once 
suitably normalized, lead to a complex integral representation that eventually 
reduces to Airy functions. 



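Theorem 3 (Local limit law and nearby saddles). The probability distri- 
bution $\Pr(X_n = k)$ admits a local limit law of the map-Airy type: for any real 
numbers $a$, $b$, one has 

\[ \sup_{a \le \frac{k - n/3}{n^{2/3}} \le b}\left|\, n^{2/3}\Pr(X_n = k) - \frac{16}{81}\,\frac{3^{4/3}}{4}\,\mathcal{A}\!\left(\frac{3^{4/3}}{4}\,\frac{k - n/3}{n^{2/3}}\right)\right| \to 0. \]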



Proof. We set k = n/3 -I- xnf^^ where x lies in a finite interval of the real line, and 
define H := ^ j z) (this replaces ln(P) in the previous argument). The starting 

point is again the integral representation (0 taken along a contour P that comprises 
Jo, a circle minus a small arc, together with two connecting small segments A\,A 2 
of length (5, now meeting at <(, where 5 is chosen according to the requirement cn). 
The arc Pq lies below the level curve of and the corresponding contribution £0 is 
estimated to be exponentially negligible. 

We turn next to the contribution $\mathcal{E}_{1,2}$ arising from $\Delta_1 \cup \Delta_2$. The distance between 
the two saddle points $z_-$, $z_+$ is $O(n^{-1/3})$, which represents the “scale” of the problem. 
One thus sets $z = \zeta + v\,n^{-1/3}$. Local expansions of $H$ and $G$ are then best carried out 
with the help (suitably monitored!) of a computer algebra system like Maple. The com- 
putation relies on the assumption $x = O(1)$, but some care in performing expansions 
is required because of the relations (11). We find eventually 

= 4"''n“^/®exp J ^ (9x/2-8v) exp (-^v^ -^x^v^(l + ri) dv, 



where the error term -q satisfies q — 0(<5'’n -|- and the segments A{, A '2 each 

have length tending to infinity according to our assumptions. Perform finally 

the change of variable v = (^) t and complete the integration path to 
the integral then reduces to Ai(a;), Ai'(a;) through contour integrals representations 
equivalent to JU (by Cauchy’s theorem, with integration path changed to 

Thus, for x = 0(1) and k = n/3 -I- , the main estimate found is 








4 




(1 + 0 ( 1 )), 



where A{x) is the map- Airy density function. This form is equivalent to the statement. 
The argument also gives a speed of convergence to the limit law of 0(n“^+^+°^^^). 



3 Coalescing Saddles 

In the present section, we provide a uniform description of the transition regions 
around $n/3$, allowing $k$ to range anywhere between $o(n)$ and $n - o(n)$, precisely, between 
$\lambda(n)$ and $n - \lambda(n)$, for any $\lambda(n) = o(n)$ with $\lambda(n) \to \infty$. For the study of 
this wide region in the scale of $n$, we set 

\[ k = \alpha_0 n + \beta n = (1/3 + \beta)\,n, \]

with estimates valid uniformly for $\beta$ in any compact subinterval of $]-\tfrac{1}{3}, \tfrac{2}{3}[$. 

Theorem 4 (Large range and coalescent saddles). Let k = n{l/3 + P) , 
and 7,01,04 be the functions of P given below. Let A(n) be any function with 
\{n) = o{n) and \{n) — >■ -boo. Then, with \ Pr(X„ = n/3 + Pn) equals 

81(1 + (f ■’'<»> + “P (- a’) 4 + 0 (!/»)) . 

( 12 ) 

where the error term is uniform for P in any compact subinterval of] — |, |[ 
and, up to replacing 0{l/n) by 0{X{n)~^) , it is also uniform for any k > A(n). 
With L{x) = x\nx, the quantities 7 , oi, and 04 are: 

( 1 1 9 A 

7 = f 2£(1 + 3/3/4) - -£(1 - 3/3/2) - -£(1 + 3/3) - -/31n2j (13) 

^/t _ 4 [1 _ ai 

2 8 V(l + 3/3/4)(l-3/3/2)(l-b 3/3)// ^ 9 p'^\j P 472^ ^ 
The estimates involve Airy functions composed with the quantity x that depends 
nonlinearly on (3. In particular, formula (HHl extends the estimates of Section 0 
when fc = n/3 + since in that case y oc x while /3 — >■ 0 and the following 

approximations apply: 

7 = ^— /3 + 0(/3^), ^ = — ^3^/^ + 0(/3), /3 — >■ 0. 

(The resulting speed of convergence to the Airy law appears to be 0(n“^/^).) 
As soon as k leaves the n/3 ± region, the two Airy terms in (1 1 211 start 

interfering and large deviations are then precisely quantified by (1 1 211 . When k 
drifts away to the left of n /3 (and x — >■ — oo), basic asymptotics of Airy functions 
show that the formula simplifies to agree with the results of Section 0 

Proof. The transition phenomenon to be described is the coalescence of two simple 
saddle points into a double one; see f- 924 \ . The simplest occurrence of the phenomenon 
appears in the integration of exp[n/(t, 7)] with 

f'ikl) - 7 ^- 

Indeed in this case there are two saddle points ±7, coalescing into a double saddle 
point as 7 — >■ 0 . The strategy consists in performing a change of variable in order to 
rednce the original problem (jH) to this simpler case. Denote the kernel of the integral 
as H{z,P) = with k = (l /3 + / 3 n) and the dependency on (3 made explicit. 

The integral in (0 is 

I(n,P) = J G{z) exp[nH{z, P)]dz, 

and we seek a change of variable of the form 

H{z,P) = - {t^/ 3 -y^t) + r = ( 15 ) 

It turns out that, taking 7 = 7(/3) to be the real cubic root of 7® = j[H{z+,P) — 
H{z-,P)], (the relation is expressed by II 31 ) i and r = r(/ 3 ) to be 

r = ^[H{z+,P) + H{z-,P)] = H{z+,P) - ^7® = / p) - ^7®, ( 16 ) 

there exists a conformal map z ^ t from the disc D of diameter [|, |] to a domain 
Dff satisfying (11 ail and mapping z± onto ±7. For simplicity, we restrict /3 to [— 1 , 1 ]. 
The domain Dp contains the disc D' of diameter [— |, |]. Let us denote by z{t) the 
inverse mapping and Go{t, P) = G{z{t))z{t) where z{t) = Remark that Go{t, P) is 
regular in D' . To guide his intuition, the reader may think of the map « — ^ t as a slight 
deformation of the map 2 — >■ 2(z — r). 

Let us now proceed with the integral. As is usual with saddle point integrals we 
first localise the integral in D, neglecting the parts of the path down in valleys, 

I(n,P)= f G{z) exp[nH{z, P)]dz = f G{z) exp[nH{z, P)] dz + £i{n, P), 

Jr J rnD 

where £i{n, P) is exponentially negligible when n — ^ 00, uniformly in p. Inside the disc 
D we apply the change of variables dini), then restrict attention to the disc D' , and 
deform the contour onto the relevant part of 3 , i > 0 }: 

I{n,(5)= f G{z{t))exp[nf{t,'r)]z{t)dt + £i{n,l3) 
J r^nDp 



L 



Go(t, /3)exp[n/(t, 7 )] dt + £ 2 {n, P). 



In order to evaluate this integral one needs to dispose of the modulation factor Go{t, j3). 
This can be done via an integration by part: A local expansion near 7 yields 

Go{t, /3) = (7 - f)ai + - 7^)Hoit, /3), 

where Ho{t,(3) is regular in D' , and ai is given by 111 41 . The integral I{n,/3) is thus 
I{n, 13) = exp{nr) / (7 — t)ai exp (— n (t^/3 — 7 ^t)) dt + i?o(u, /3), 

JAaoDD' 

where after integration by part, and up to another exponentially negligible term, 



Ro{n,P) = 



exp(nr) 



L 



AoonD' 



^Ho(t,P) ) exp 



I ^ 

-w 1 y - 7 i 



dt + £3(71,13). 



The integration by part has reduced the order of magnitude by a factor n, but Ro(n, (3) 
is amenable to the same treatment as I(n, (3). We shall content ourselves with the next 
terms: let ^Ho(t,f3) = + o-st + (t^ — ^'^)H\(t, (3), with H\(t,f3) regular in D' , 02 , 



03 functions of j3, so that we have 
I(n,(3) = exp(nr) J (^7 ^oi + j - t (^ai - j j exp 






dt + Ri(n, 13). 



where the integral has been extended to the whole of Aoa at the expense of yet another 
exponentially negligible term. The error term is 



D ( n\ exp(nr) 

Ri(n,l3) = ^ 

T) ^ 



L 



AoonD' 



^Hi(t,l3) ) exp 



I ^ 

-w 1 y - 7 i 



dt + £4(71, l 3 ). 



In terms of the Airy function, we thus have 

I(ti,(3) = (^ 7 n^^®^ai + ‘^jAi(n^/® 7 ^) - ^ai - ^ j Ai'(n^'^® 7 ^)^ +i?i(n,/3), 

and the error term Ri ( 71 , (3) can be estimated: there exist do and di positive such that 

ia(n,ffll < (^|A1(»"'7")| + ^|A1>“'7^)|) . 

The theorem follows from formulae ©, 0 ) Hlhl) and the definition of the map- Airy 
law, upon setting 04 = 7(02 + as). 



4 Applications to Maps and Random Sampling 

The results obtained in the particular case of 3-connected cores of nonseparable 
maps are instances of a very general pattern in the physics of random maps. 
Indeed all families in the table below obey the Lagrangean framework and are 
amenable to the saddle point methods developed in previous sections. 
Table 1. A selection of composition schemes (A an edge, C,T> auxiliary families). 



maps (M), Mn 


cores (C), scheme 


ao 


c 


general, n edges 


nonseparable, At ~ C[AAt^j 


1/3 


3/4^/® 


general, n edges 


bridgeless. At ~C[A(AAt)*j 


4/5 


(5/3)^/®/4 


general, n edges 


loopless. At ~ £ + C[A((AAt)*)®] 


2/3 


3/2 


loopless , n edges 


simple, At ~ C[AAt] 


2/3 


34 /® /4 


bipartite, n edges 


bip. simples, At ~ C[AAt] 


5/9 


3®/®/20 


bipartite, n edges 


bip. nonsep., At ~ C[AAt^| 


5/13 (13/6)®/® • 3/1 


bipartite, n edges 


bip. bridgeless. At ~C[A(AAt)*| 


3/5 


(15/2)®/®/18 


nonsep., n edges 


simple nonsep., At ~ C[AAt] 


4/5 


15®/®/36 


nonsep., n -|- 1 edges 


3-connected, At ~ D + C[At] 


1/3 


34 /® /4 


cubic nonsep., n + 2 faces cubic 3-conn., A4 ~ C[A(1 -I- At)®] 


1/2 


(3/2)1/® 


cubic 3-conn., n + 2 faces cubic 4-conn., At — At • C[AAt®] 


1/2 


6^/®/3 



Theorem 5. Consider any scheme of Table 1 with parameters $\alpha_0$ and $c$. The 
probability $\Pr(X_n = k)$ that a map of size $n$ has a core of size $k$ has a local limit 
law of the map-Airy type with centering constant $\alpha_0$ and scale parameter $c$. 

The technique of im relates the size of the core to the size of the largest 
component in random maps. Also, since maps have almost surely no symmetries 
ca, the analysis extends to unrooted maps. As a consequence: 

Theorem 6. (i) Consider any scheme of Table 1 with parameters ao and c. 
Let X* be the size of the largest component of in a random map of size n with 
uniform distribution. Then 

Pr (^n = L«on + (1 + 

uniformly for x in any bounded interval. Furthermore, if x is restricted to the 
shorter range |a;| < A(n)“^ for a fixed function A(n) going to infinity with n, 
then 

Pi'(^n= [aon + xn'^/^l'^ = (1 + 0(A(n)~^)). 

(ii) The same results hold for random unrooted maps. 

Theorem 6 extends results of Bender, Gao, Richmond, and Wormald m 
who proved that $X_n^*$ lies in the range $\alpha_0 n \pm \lambda(n)\,n^{2/3}$ with probability tending 
to 1, where $\lambda(n)$ is any function going to infinity with $n$. 

Random sampling algorithms for various families of planar maps were de- 
scribed in nni. For general, nonseparable, bipartite, and cubic nonseparable 
maps, an algorithm Map is given there that takes an integer n and outputs 
in linear time a map of size n uniformly at random. For the other families of 
Table 1, a probabilistic algorithm Core described below is used. 
Probabilistic algorithm Core(k) with parameter f(k) 

1. use Map(n) to generate a random map $M \in \mathcal{M}$ of size $n = f(k)$; 

2. extract the largest component $C$ of $M$ with respect to the scheme; 

3. if $C$ does not have size $k$, then go back to step 1; 

4. output $C$. 
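
The four steps above can be sketched as a generic rejection loop. The code below is only an illustration: the paper's samplers are not reproduced, so the callables `sample_map`, `largest_component` and `size` are hypothetical stand-ins supplied by the caller, and the `max_loops` guard is an added safety assumption.

```python
def core(k, f, sample_map, largest_component, size, max_loops=1_000_000):
    """Rejection loop of algorithm Core(k).

    f                 -- maps the target core size k to the size n of the basic map
    sample_map        -- hypothetical uniform sampler Map(n) for basic maps
    largest_component -- hypothetical extractor of the largest component
    size              -- hypothetical size function on components
    Returns the accepted component and the number of loops used.
    """
    n = f(k)
    for loops in range(1, max_loops + 1):
        m = sample_map(n)                 # step 1: random basic map of size n
        c = largest_component(m)          # step 2: extract largest component
        if size(c) == k:                  # step 3: reject unless the size is k
            return c, loops               # step 4: output
    raise RuntimeError("no component of the requested size was found")


if __name__ == "__main__":
    # Toy run: "maps" are plain integers standing for component sizes.
    sizes = iter([3, 5, 4])
    print(core(4, lambda k: 3 * k, lambda n: next(sizes), lambda m: m, lambda c: c))
```

Returning the loop count alongside the component makes it straightforward to compare the observed number of iterations with the predicted mean number of loops.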

Save for a set of measure that is exponentially small, this algorithm produces a 
uniform element of $\mathcal{C}_k$. The expected number of loops made by Core is exactly 
$\ell_n = \Pr(X_n = k)^{-1}$. The results of the paper enable us to precisely analyse this 
and a number of related algorithms of jlHIEI. We cite here just one result: 

Theorem 7. In all extraction/rejection algorithms of m, the choice $f(k) = 
k/\alpha_0$ yields an algorithm whose average number of iterations satisfies 

\[ \ell_n \sim n^{2/3}/(\mathcal{A}(0)\,c). \]

Let $x_0 \approx 0.44322$ be the position of the peak of the map-Airy density func- 
tion ($(1 - 4x_0^3)\,\mathrm{Ai}(x_0^2) + 4x_0^2\,\mathrm{Ai}'(x_0^2) = 0$). The optimal choice $f(k) = k/\alpha_0 - \frac{x_0}{c\,\alpha_0}(k/\alpha_0)^{2/3}$ 
reduces the expected number of loops by $1 - \mathcal{A}(0)/\mathcal{A}(x_0) \approx 30\%$. 
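
The peak position and the saving of about 30% can be reproduced numerically. The following sketch is an illustration only: it evaluates the Airy function through its power series (an assumption that is adequate for the small arguments needed here), locates the zero of $(1-4x^3)\mathrm{Ai}(x^2) + 4x^2\mathrm{Ai}'(x^2)$ by bisection on the hypothetical bracket $[0.1, 1]$, and then compares the density $2\exp(-2x^3/3)\,(x\,\mathrm{Ai}(x^2) - \mathrm{Ai}'(x^2))$ at 0 and at the peak.

```python
import math

CBRT3 = 3.0 ** (1.0 / 3.0)
NORM = math.pi * 3.0 ** (2.0 / 3.0)


def _coeff(n):
    return (math.gamma((n + 1) / 3.0)
            * math.sin(2.0 * (n + 1) * math.pi / 3.0) / math.factorial(n))


def airy_ai(z, terms=60):
    """Ai(z) from its power series (small |z| only)."""
    return sum(_coeff(n) * (CBRT3 * z) ** n for n in range(terms)) / NORM


def airy_ai_prime(z, terms=60):
    """Ai'(z) by termwise differentiation of the series."""
    return sum(_coeff(n) * n * CBRT3 * (CBRT3 * z) ** (n - 1)
               for n in range(1, terms)) / NORM


def map_airy_density(x):
    return 2.0 * math.exp(-2.0 * x ** 3 / 3.0) * (
        x * airy_ai(x * x) - airy_ai_prime(x * x))


def peak_equation(x):
    """Proportional to the derivative of the density; zero at the peak."""
    return (1 - 4 * x ** 3) * airy_ai(x * x) + 4 * x * x * airy_ai_prime(x * x)


def find_peak(lo=0.1, hi=1.0, iterations=60):
    """Bisection, assuming peak_equation is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if peak_equation(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    x0 = find_peak()
    saving = 1 - map_airy_density(0.0) / map_airy_density(x0)
    print(x0, saving)
```

The saving is the relative drop in the expected number of loops when the target size is shifted so that $k$ sits at the peak of the density rather than at its centering point.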

This proves that the extraction/rejection algorithms have overall complexity 
$O(k^{5/3})$, as do variant algorithms of [18, 19] that are uniform over all $\mathcal{C}_k$. The 
complexity becomes $O(k)$ if some small tolerance is allowed on the size of the 
multiply connected map generated. Theorems of the paper enable us to quantify 
precisely various trade-offs and fine-tune algorithms (details in the full paper). 
As exemplified by Fig. 1(ii), the predictions fit nicely with experimental results. 

5 Composition of Singularities 

Map enumeration can be approached through the Lagrangean framework and 
the saddle point analysis developed so far takes off from there. An alterna- 
tive approach to the problem relies on singularity analysis |^, as introduced in 
Section 0 The results of this section contribute to the general classification of 
combinatorial schemas according to the nature of their singularities m- 

First, a definition. Let $M$ and $C$ be two generating functions with dominant 
singularities at $\rho$ and $\alpha$, such that $M(z) = \alpha - a(1 - z/\rho) + b(1 - z/\rho)^{3/2} + O((1 - 
z/\rho)^2)$, and $C(z) = c_0 - a'(1 - z/\alpha) + b'(1 - z/\alpha)^{3/2} + O((1 - z/\alpha)^2)$, in an indented 
domain extending beyond the circle of convergence (see j0|). Then the bivariate 
substitution scheme $C(uM(z))$ is said to be a critical composition scheme of 
type (3/2, 3/2). The functional composition $C(uM(z))$ describes the size of the 
$\mathcal{C}$ component in a combinatorial substitution $\mathcal{C}[\mathcal{M}]$. The scheme is called critical 
since the singular value of the inner function ($M$) equals the singularity of the 
outer function ($C$). It will be recognized that Tutte’s construction is an instance 
(with $\alpha$ replacing the map-specific $\psi(\tau)$ of formulae (5) and (7)). Schemes of 
this broad form have been only scantily analysed, a notable exception being the 
critical composition scheme of type $(-1, 3/2)$ that shows up in ordered forests 
and in random mappings (functional graphs): in that case, the density is known 
to be of the Rayleigh type iHEni . The results of this section somehow recycle 
in a different realm the intuition gathered by the method of coalescing saddles, 
although the technical developments are a bit different. 

Theorem 8. (i) For k = an + X{n), with 0 < a < oq = y o.nd A(n) = o{n), the 
prohahility distribution of the size X„ of the C-component of random element of 
C[M] of size n satisfies 

Pr(AT„ = fc) ~ for a > 0; 

Vv{Xn = k) ^ for a = 0 and A(n) — >• +oo . 

{ii) For k = agn + , uq = aja, x = an Airy-map law holds: 

nf/^FrtXn = aon + xr?!'^') ^ cAl(cx) where c = /oo- 

olq b \oa/ 

The proof relies on a modification of the Hankel contour used in classical singu- 
larity analysis together with a different scaling. It will be developed in the full 
paper. The theorem is a companion to Theorems □ QHIl that can also be used 
to analyse forests of unrooted trees HH in the critical region, a problem itself 
relevant to the emergence of the giant component in random graphs Pea- 
Analysing Input/Output-Capabilities of Mobile 
Processes with a Generic Type System* 



Barbara König (koenigb@in.tum.de) 
Fakultät für Informatik, Technische Universität München 



Abstract. We introduce a generic type system (based on Milner’s sort system) 
for the synchronous polyadic π-calculus, allowing us to mechanise the analysis of 
input/output capabilities of mobile processes. The parameter of the generic type 
system is a lattice-ordered monoid, the elements of which are used to describe 
the capabilities of channels with respect to their input/output-capabilities. The 
type system can be instantiated in order to check process properties such as upper 
and lower bounds on the number of active channels, confluence and absence of 
blocked processes. 



1 Introduction 

For the analysis and verification of processes there are basically two approaches: meth- 
ods that are complete, but cannot be fully mechanised, and fully automatic methods 
which are consequently not complete, i.e. not all processes satisfying the property to be 
checked are recognised. 

One promising direction for the latter approach is to use type or sort systems and 
type inference with rather complex types abstracting from process behaviour. In the last 
few years there have been several papers presenting such type systems for the polyadic 
π-calculus and other process calculi, checking e.g. input/output behaviour [15], absence 
of deadlocks [7], security properties [1,4], allocation of permissions to names [16] and 
many others. Types are compositional and thus allow reuse of information obtained in 
the analysis of smaller subsystems. 

One drawback of the type systems mentioned above is the fact that they are spe- 
cialised to check very specific properties. A much more general approach is a theory 
of types by Honda [6] which is based on typed algebras and gives a classification of 
type systems. This theory is very general and it is thus necessary to prove the subject 
reduction property and the correctness of a type system for every instance. Our paper 
attempts to fill the gap between the two extremes. We present a generic type system 
where we can show the subject reduction property for the general case, and by instan- 
tiating the type system we are able to analyse specific properties of processes. Despite 
its generality, our type system can be used to generate existing type systems, or at least 
subsets of them. With the introduction of residuation (explained below) we can even 
type some processes which are not typable by comparable type systems. 

We concentrate on properties connected to input/output capabilities of processes in 
the synchronous polyadic π-calculus. In our examples (see section 5) we check prop- 
erties such as upper and lower bounds on the number of active channels, confluence, 
absence of blocked input or output prefixes. Determining these capabilities of a process 
involves counting and we attempt to keep this concept as general as possible by basing 
the generic type system on commutative monoids. Instantiating a type system mainly 
involves choosing an appropriate monoid, and monoid elements associated with input 
and output prefixes (e.g. for counting the number of prefixes with a certain subject). 

Instead of giving the precise answer to every question, our type system uses over- 
approximation (e.g. we can expect results of the form “there are at most two active 
channels with subject x at any given time”). Hence plain monoids are not sufficient, but 
we need ordered monoids (so-called lattice-ordered monoids or l-monoids), equipped 
with a partial order compatible with summation. 

There is a huge class of lattice-ordered monoids which are residuated, i.e. some 
limited form of subtraction can be defined. Residuation can be put to good use in pro- 
cess analysis. Consider, e.g. the process $P = \bar{x}.x.0$. While $P$ increases the number of 
occurrences of the output prefix $\bar{x}$ by one, it does not do so for the input prefix $x$, since 
we are interested exclusively in the number of prefixes on the outer level (i.e. in pre- 
fixes which are currently active) and $x$ can only be reached by a communication with $\bar{x}$ 
which decreases the number of input prefixes in the environment by one. This decrease 
can be anticipated when typing $P$, and is taken into consideration by subtracting one 
from the number of input prefixes. 

The type of a process contains an assignment of names to sorts and a mapping of 
sorts to strings of sorts (as in [13]), keeping track of channel arities, i.e. if channel x has 
sort s, and n-ary tuples are communicated via x, then s will be mapped to a string of 
sorts having length n, being the sorts of the respective channels. Thus, successful typing 
also guarantees the absence of runtime errors produced by mismatching arities. Further- 
more a monoid element is assigned to each sort s. The monoid element is expected to 
be an upper bound for the capabilities of all channels having sort s. 

2 Preliminaries 

2.1 The π-Calculus

The π-calculus [12, 13] is an influential paradigm describing communication and
mobility of processes. In this paper we consider the synchronous polyadic π-calculus
without choice and matching, in which replication is only defined for input prefixes.
Its syntax is as follows:

$$P ::= 0 \ \big|\ (\nu x\colon s)P \ \big|\ P_1 \,|\, P_2 \ \big|\ \bar{x}\langle\tilde{z}\rangle.P \ \big|\ x(\tilde{y}).P \ \big|\ !x(\tilde{y}).P$$

where $s$ is an element from a fixed set of sorts $S$ and $x$ is taken from a fixed set of names
$\mathcal{N}$. $\tilde{y} = y_1 \ldots y_n$ and $\tilde{z} = z_1 \ldots z_n$ are abbreviations for sequences with elements from
$\mathcal{N}$. We call $\bar{x}\langle\tilde{z}\rangle$ output prefix and $x(\tilde{y})$ input prefix.

The set of all free names (i.e. names not bound by either $\nu$ or by an input prefix) of
a process $P$ is denoted by $fn(P)$. The process obtained by replacing the free names $y_i$
by $x_i$ in $P$ (and avoiding capture) is called $P\{\tilde{x}/\tilde{y}\}$.

Structural congruence $\equiv$ is the smallest congruence obeying the rules in the upper part
of table 1, and equating processes that can be converted into one another by consistent
renaming of bound names ($\alpha$-conversion). We use a reduction semantics as for the
chemical abstract machine [2] instead of a labelled transition semantics.

Structural congruence:

(C-COM)      $P_1 \,|\, P_2 \equiv P_2 \,|\, P_1$
(C-0)        $P \,|\, 0 \equiv P$
(C-ASS)      $P_1 \,|\, (P_2 \,|\, P_3) \equiv (P_1 \,|\, P_2) \,|\, P_3$
(C-Restr 1)  $(\nu x\colon s)(\nu y\colon t)P \equiv (\nu y\colon t)(\nu x\colon s)P$  if $x \neq y$
(C-Restr 2)  $((\nu x\colon s)P_1) \,|\, P_2 \equiv (\nu x\colon s)(P_1 \,|\, P_2)$  if $x \notin fn(P_2)$

Reduction rules:

(R-COMM)     $\bar{x}\langle\tilde{z}\rangle.Q \,|\, x(\tilde{y}).P \to Q \,|\, P\{\tilde{z}/\tilde{y}\}$
(R-Rep)      $\bar{x}\langle\tilde{z}\rangle.Q \,|\, !x(\tilde{y}).P \to Q \,|\, P\{\tilde{z}/\tilde{y}\} \,|\, !x(\tilde{y}).P$
(R-Par)      $\dfrac{P \to P'}{P \,|\, Q \to P' \,|\, Q}$
(R-Restr)    $\dfrac{P \to P'}{(\nu x\colon s)P \to (\nu x\colon s)P'}$
(R-EQU)      $\dfrac{Q \equiv P,\quad P \to P',\quad P' \equiv Q'}{Q \to Q'}$

Table 1. Operational semantics of the π-calculus

Consider the following processes which we will use as an example in this paper (we 
omit the final 0 ): 

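$F = c(r).\bar{d}\langle r\rangle.d(a).\bar{c}\langle a\rangle \qquad S = d(s).s(h_1, h_2).\bar{d}\langle h_1\rangle \qquad T = \bar{c}\langle h\rangle.c(x) \qquad H = \bar{h}\langle i_1, i_2\rangle$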

There is a forwarder F which receives requests on a channel c, forwards them on a 
channel d to a server, receives the answer and sends it back on c. The server S receives 
requests on d, and we assume that these requests come with a name s where the server 
can get further information. The server obtains this information, processes it and sends 
the answer back on d (in our example we keep the “processing part” very simple, the 
server just sends back the first component). Furthermore T is a trigger process, starting 
the execution of F and receiving the result in the end, and H delivers information to 
the server. 

We can combine the processes F, S, T, H to obtain P as the entire system. If we
want F and S to be persistent, we consider P':

$P = T \,|\, H \,|\, (\nu d\colon s_d)(F \,|\, S) \qquad P' = T \,|\, H \,|\, (\nu d\colon s_d)(!F \,|\, !S)$
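
As a worked illustration, the system P can be reduced step by step with (R-COMM), (R-Par), (R-Restr) and structural congruence from table 1. The trace below is our own reconstruction: it assumes the reading of F, S, T and H with overbars on output prefixes written out in the first line, omits trailing 0, and silently uses scope extension (C-Restr 2) to bring T and H under the restriction on d where needed.

```latex
\begin{align*}
P &= \bar{c}\langle h\rangle.c(x) \mid \bar{h}\langle i_1,i_2\rangle \mid
     (\nu d\colon s_d)\bigl(c(r).\bar{d}\langle r\rangle.d(a).\bar{c}\langle a\rangle
       \mid d(s).s(h_1,h_2).\bar{d}\langle h_1\rangle\bigr) \\
  &\to c(x) \mid \bar{h}\langle i_1,i_2\rangle \mid
     (\nu d\colon s_d)\bigl(\bar{d}\langle h\rangle.d(a).\bar{c}\langle a\rangle
       \mid d(s).s(h_1,h_2).\bar{d}\langle h_1\rangle\bigr)
     && \text{communication on } c \\
  &\to c(x) \mid \bar{h}\langle i_1,i_2\rangle \mid
     (\nu d\colon s_d)\bigl(d(a).\bar{c}\langle a\rangle
       \mid h(h_1,h_2).\bar{d}\langle h_1\rangle\bigr)
     && \text{communication on } d \\
  &\to c(x) \mid
     (\nu d\colon s_d)\bigl(d(a).\bar{c}\langle a\rangle \mid \bar{d}\langle i_1\rangle\bigr)
     && \text{communication on } h \\
  &\to c(x) \mid (\nu d\colon s_d)\,\bar{c}\langle i_1\rangle
     && \text{communication on } d \\
  &\to (\nu d\colon s_d)\,0
     && \text{communication on } c
\end{align*}
```

In every intermediate process each name occurs at most once as an active input prefix and at most once as an active output prefix, so there is only one redex at each step.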

A programmer analysing this piece of code might be interested in the following 
properties: input/output behaviour, upper and lower bound on the number of channels 
being active, confluence properties and absence of blocked prefixes that never find a 
communication partner. E.g., examining P will reveal that at any given time every name 
is used for input and output at most once and that P is therefore confluent.

2.2 Residuated Lattice-Ordered Monoids

Lattice-ordered monoids are a well-developed mathematical concept (see e.g. [3]). We
are interested in commutative residuated l-monoids in order to represent input/output
capabilities.

Definition 1. (Lattice-ordered Monoid)

A commutative lattice-ordered monoid (l-monoid) is a tuple $(I, +, \le)$ where $I$ is a
set, $+\colon I \times I \to I$ is a binary operation and $\le$ is a partial order which satisfy:

- $(I, +)$ is a commutative monoid, i.e. $+$ is associative and commutative, and there
is a unit 0 with $0 + a = a$ for every monoid element $a \in I$.

- $(I, \le)$ is a lattice, i.e. $\le$ is a partial order, where two elements $a, b \in I$ have a join
(or least upper bound) $a \vee b$ and a meet (or greatest lower bound) $a \wedge b$.

- $I$ contains a bottom element $\bot$, the smallest element in $I$, and a top element $\top$, the
greatest element in $I$.

- For $a, b, c \in I$: $a + (b \vee c) = (a + b) \vee (a + c)$ and $a + (b \wedge c) = (a + b) \wedge (a + c)$.

Any l-monoid $(I, +, \le)$ is associated with an l-monoid $(I, \oplus, \le)$ where $a \oplus b =
(a + b) \vee a \vee b$ and $\bot$ is the unit. The significance of $\oplus$ can be made clear with the
following consideration: monoid elements will be used to label sorts, being an upper
bound for the capabilities of channels having this sort. E.g., we assume that a free name
$x$ and a bound name $y$ have sort $s$, indicating that, during reduction, $x$ might replace $y$.
The capabilities of $x$ and $y$ are $a$ and $b$ respectively. What capability should be associated
with $s$? In the presence of positive monoid elements only, $a + b$ is the correct answer.
If, however, $a$ is negative, $a + b$ is actually smaller than $b$ and if $x$ has not yet replaced
$y$, the monoid element associated with $s$ underestimates the capabilities of $y$. Since we
use over-approximation the correct sort label is $a \oplus b$.
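
The difference between $+$ and $\oplus$ can be sketched in a few lines. The snippet is only an illustration under a simplifying assumption of ours: it works on ordinary integers, where the join is `max`, and leaves out the elements $\bot$ and $\top$. The function name `oplus` is hypothetical.

```python
def oplus(a: int, b: int) -> int:
    """a (+) b = (a + b) v a v b on the integers, where the join v is max."""
    return max(a + b, a, b)


# A free name x with capability a = -1 and a bound name y with capability b = 2
# share one sort.
a, b = -1, 2
plain_sum = a + b         # 1: smaller than b, so it underestimates y on its own
sort_label = oplus(a, b)  # 2: still an upper bound for y before x replaces it

print(plain_sum, sort_label)
```

For non-negative arguments `oplus` and `+` coincide, which matches the remark that $a + b$ is the correct answer when only positive monoid elements are present.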

Definition 2. (Residuated l-monoid) Let $(I, +, \le)$ be an l-monoid and let $a, b \in I$.
The residual $a - b$ is the smallest $x$ (if it exists) such that $a \le x + b$. $I$ is called
residuated if all residuals $a - b$ exist in $I$ for $a, b \in I$.

Example: one residuated l-monoid which we will later use for the analysis of
processes is $IO = (\{none, I, O, both\}, \vee, \le)$ where $none < I < both$, $none < O < both$
and the monoid operation is the join, i.e. the l-monoid degenerates to a lattice. A
channel name has for example capability $O$ if it is used at most for output and capability
$both$ if it may be used for both output and input.
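
A minimal sketch of this four-element l-monoid follows, assuming an encoding of our own choosing: each element is a two-bit set, so the join becomes bitwise or and the residual becomes set difference. The names `IO`, `join`, `leq` and `residual` are illustrative and do not come from the paper.

```python
from enum import IntEnum


class IO(IntEnum):
    NONE = 0  # no capability
    I = 1     # input only
    O = 2     # output only
    BOTH = 3  # input and output


def join(a: IO, b: IO) -> IO:
    """Monoid operation and lattice join: union of the two bit sets."""
    return IO(int(a) | int(b))


def leq(a: IO, b: IO) -> bool:
    """a <= b iff every capability in a is also in b."""
    return (int(a) & int(b)) == int(a)


def residual(a: IO, b: IO) -> IO:
    """Smallest x with a <= join(x, b): the capabilities of a not covered by b."""
    return IO(int(a) & ~int(b) & 3)


print(residual(IO.BOTH, IO.O).name)
print(residual(IO.O, IO.O).name)
```

The second call shows the effect of residuation used later in the examples: subtracting an output capability from an output capability leaves none.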

In order to count the number of inputs or outputs we use the residuated l-monoid
$\mathbb{Z}^\infty = (\mathbb{Z} \cup \{\infty, -\infty\}, +, \le)$ with all integers including $\infty$ and $-\infty$ ($\infty + (-\infty) =
-\infty$). Residuation is subtraction for all monoid elements different from $\infty$ and $-\infty$.
The cartesian product of two l-monoids, e.g. $\mathbb{Z}^\infty \times \mathbb{Z}^\infty$, is also an l-monoid.
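
The counting monoid and its cartesian product can be sketched as follows. This is a sketch under stated assumptions: infinities are represented by `math.inf`, the convention for adding opposite infinities is taken from the text, and the residual is only implemented for finite integers, where the text says it is ordinary subtraction. All function names are ours.

```python
import math

INF = math.inf


def add(a, b):
    """Addition on the integers extended with INF and -INF."""
    if a == -INF or b == -INF:
        # Also covers INF + (-INF), which the text fixes to -INF.
        return -INF
    return a + b


def residual(a, b):
    """Residual a - b, sketched for finite integers only."""
    if math.isinf(a) or math.isinf(b):
        raise ValueError("only the finite case is sketched here")
    return a - b


def add2(p, q):
    """Componentwise addition on pairs (cartesian product of two copies)."""
    return (add(p[0], q[0]), add(p[1], q[1]))


def leq2(p, q):
    """Componentwise order on pairs."""
    return p[0] <= q[0] and p[1] <= q[1]


print(add2((1, 0), (0, 1)))
print(leq2((1, 0), (1, 1)))
```

On the finite part the first inequation of this section, $a \le (a - b) + b$, holds with equality.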



We use the following inequations concerning residuated l-monoids: for all elements
$a, b, c$ of a residuated l-monoid it holds that

$$a \le (a - b) + b \qquad (a + b) - b \le a \qquad (a + b) - c \le (a - c) + b$$
$$(a + b) \vee 0 \le (a \vee 0) + (b \vee 0) \qquad a + b \le a \oplus b \qquad \bot + \bot = \bot \qquad \top + \top = \top$$

And we define:

$$sig(a) = \begin{cases} \bot & \text{if } a < 0 \\ \top & \text{if } a > 0 \\ 0 & \text{if } a = 0 \\ \text{undefined} & \text{otherwise} \end{cases}$$

3 The Type System and Its Properties

We define the notion of types and type assignments which have already been informally
introduced in section 1.

Definition 3. (Type Assignment) Let $S$ be a fixed set of sorts and let $(I, +, \le)$ be a
fixed l-monoid. A type assignment $\Gamma = ob_\Gamma; m_\Gamma; x_1\colon (s_1/a_1), \ldots, x_n\colon (s_n/a_n)$
(abbreviated by $ob_\Gamma; m_\Gamma; \tilde{x}\colon (\tilde{s}/\tilde{a})$) consists of a sort mapping $ob_\Gamma\colon S \to S^*$ (mapping
sorts to object sorts), a mapping $m_\Gamma\colon S \to I$ (assigning a monoid element to every
sort) and an assignment of channel names $x_i$ to tuples consisting of a sort $s_i$ and a
monoid element $a_i$.

We define $sort_\Gamma(x_i) = s_i$, and $\Gamma, y\colon (t/b)$ denotes $ob_\Gamma; m_\Gamma; \tilde{x}\colon (\tilde{s}/\tilde{a}), y\colon (t/b)$.

Sorts are used to control the mobility of names. That is, if $ob_\Gamma(s) = s_1 \ldots s_n$, we
know that only n-tuples of channel names with sorts $s_i$ are sent or received via a channel
with sort $s$. If a free name $x$ and a bound name $y$ have the same sort, we have to take into
account that $x$ may replace $y$ during the reduction. We also use sorts as an intermediate
level between names and monoid elements, since with $\alpha$-conversion it is problematic to
assign monoid elements directly to names.

Monoid elements appear in two places: in the range of $m_\Gamma$ and in the tuples $x\colon (s/a)$.
The idea is to sum up the capabilities of $x$ with $+$ in $a$ while $x$ is still free and add $a$ to
$m_\Gamma(s)$ with $\oplus$ as soon as $x$ is hidden. We have to use $\oplus$ according to the explanation
given in section 2.2. The other possibility would be to immediately add the
capabilities to $m_\Gamma(s)$ with $\oplus$ (without storing them in $a$ first), but since $a + b \le a \oplus b$, this
would lead to looser bounds. (It would, however, be possible in the case where we only
consider monoid elements greater than or equal to 0, since in this case $+$ and $\oplus$ always
coincide.)

In the rest of this paper we use the operations on type assignments given in
table 2 (all operations on sequences are conducted pointwise): in (Add-Mon) we add a
monoid element $a$ to a type assignment $\Gamma$ by adding $a$ to $m_\Gamma(s)$ (with $\oplus$) and leaving
everything else unchanged. And $\Gamma^{(\tilde{s},\tilde{a})}$ denotes $(\ldots(\Gamma^{(s_1,a_1)})\ldots)^{(s_n,a_n)}$.

(Add-Mon)  $(ob; m; \tilde{x}\colon (\tilde{s}/\tilde{a}))^{(s,a)} = ob; m'; \tilde{x}\colon (\tilde{s}/\tilde{a})$
           where $m'(s') = m(s') \oplus a$ if $s = s'$, and $m'(s') = m(s')$ otherwise
(Sum)      $(ob; m; \tilde{x}\colon (\tilde{s}/\tilde{a})) \oplus (ob; m'; \tilde{x}\colon (\tilde{s}/\tilde{b})) = ob; m \oplus m'; \tilde{x}\colon (\tilde{s}/\tilde{a} + \tilde{b})$
(Join)     $(ob; m; \tilde{x}\colon (\tilde{s}/\tilde{a})) \vee b = ob; m; \tilde{x}\colon (\tilde{s}/\tilde{a} \vee b)$
(Ord)      $(ob; m; \tilde{x}\colon (\tilde{s}/\tilde{a})) \le (ob; m'; \tilde{x}\colon (\tilde{s}/\tilde{b})) \iff m \le m'$ and $\tilde{a} \le \tilde{b}$
(Remove)   If $\Gamma = \Delta, x\colon (s/a)$, then $\Gamma\backslash\{x\} = \Delta$

Table 2. Operations on type assignments

Summation $\Gamma \oplus \Gamma'$ (Sum) is defined for type assignments which contain the same
names (having identical sorts) and which satisfy $ob_\Gamma = ob_{\Gamma'}$ (i.e. they have the same
sort structure). In this case $m \oplus m'$ and $\tilde{a} + \tilde{b}$ denote the pointwise summation. The
summation on type assignments has a counterpart (denoted by the same symbol) in
Honda’s work [6].

In (Join) a pointwise join with every monoid element assigned to a channel name 
and the monoid element b is defined. And finally we need a partial order on type as- 
signments (Ord) and an operation removing an assumption on a name x from a type 
assignment (Remove). 
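
The operations of table 2 can be sketched on a small data structure. This is an illustration under assumptions of ours, not the paper's implementation: monoid elements are plain integers (join is `max`), a type assignment is a record of three dictionaries, and the names `TypeAssignment`, `add_mon`, `summation`, `leq` and `remove` are hypothetical.

```python
from dataclasses import dataclass


def oplus(a, b):
    """a (+) b = (a + b) v a v b on integers."""
    return max(a + b, a, b)


@dataclass(frozen=True)
class TypeAssignment:
    ob: dict     # sort -> tuple of object sorts
    m: dict      # sort -> monoid element
    names: dict  # name -> (sort, monoid element)


def add_mon(g, s, a):
    """(Add-Mon): add a to m(s) with oplus, leave everything else unchanged."""
    m2 = dict(g.m)
    m2[s] = oplus(g.m[s], a)
    return TypeAssignment(g.ob, m2, g.names)


def summation(g, h):
    """(Sum): defined only for equal sort structure and equal names and sorts."""
    if g.ob != h.ob or g.names.keys() != h.names.keys():
        raise ValueError("summation is undefined for these assignments")
    if any(g.names[x][0] != h.names[x][0] for x in g.names):
        raise ValueError("names must have identical sorts")
    m = {s: oplus(g.m[s], h.m[s]) for s in g.m}
    names = {x: (s, a + h.names[x][1]) for x, (s, a) in g.names.items()}
    return TypeAssignment(g.ob, m, names)


def leq(g, h):
    """(Ord): pointwise comparison of sort labels and name labels."""
    return (all(g.m[s] <= h.m[s] for s in g.m)
            and all(g.names[x][1] <= h.names[x][1] for x in g.names))


def remove(g, x):
    """(Remove): drop the assumption on name x."""
    return TypeAssignment(g.ob, g.m, {y: v for y, v in g.names.items() if y != x})


g = TypeAssignment({"s": ()}, {"s": 0}, {"x": ("s", 1)})
h = TypeAssignment({"s": ()}, {"s": 2}, {"x": ("s", -1)})
print(summation(g, h))
```

Note how the name labels are summed with `+` while the sort labels are combined with `oplus`, mirroring the two roles of monoid elements described above.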

We are now ready to define the rules of the type system (see table 3). out and in are 
fixed monoid elements (where in must be comparable to 0) representing the capabilities 
of output respectively input prefixes. 



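(T-≤)      $\dfrac{\Gamma \vdash P,\quad \Gamma \le \Delta}{\Delta \vdash P}$

(T-Nil)    $\Gamma \vee 0 \vdash 0$

(T-Par)    $\dfrac{\Gamma_1 \vdash P_1,\quad \Gamma_2 \vdash P_2}{\Gamma_1 \oplus \Gamma_2 \vdash P_1 \,|\, P_2}$

(T-Out)    $\dfrac{\Gamma, x\colon (s/a), \tilde{z}\colon (\tilde{t}/\tilde{b}) \vdash P}{(\Gamma, \tilde{z}\colon (\tilde{t}/\tilde{b})) \vee 0,\ x\colon (s/(a - in) \vee 0 + out) \vdash \bar{x}\langle\tilde{z}\rangle.P}$  if $ob_\Gamma(s) = \tilde{t}$

(T-In)     $\dfrac{\Gamma, x\colon (s/a), \tilde{y}\colon (\tilde{t}/\tilde{b}) \vdash P}{(\Gamma \vee 0,\ x\colon (s/(a - out) \vee 0 + in))^{(\tilde{t},\tilde{b})} \vdash x(\tilde{y}).P}$  if $ob_\Gamma(s) = \tilde{t}$

(T-Restr)  $\dfrac{\Gamma\backslash\{x\}, x\colon (s/a) \vdash P}{(\Gamma\backslash\{x\})^{(s,a)} \vdash (\nu x\colon s)P}$

(T-Rep)    $\dfrac{\Gamma \vdash x(\tilde{y}).P}{\Delta, x\colon (s/a + sig(in)) \vdash\ !x(\tilde{y}).P}$  if $\Gamma \oplus \Gamma \le \Gamma$ and $\Gamma = \Delta, x\colon (s/a)$

Table 3. Rules of the type system

The intuitive meaning of the rules is as follows: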

(T-≤) We can always over-approximate the capabilities of a process.

(T-Nil) The nil process can have an arbitrary type assignment, provided the monoid
elements of the free names are greater than or equal to 0.

(T-Par) The parallel composition of two processes can be typed by adding their
respective type assignments.

(T-Out) First we subtract in from the monoid element $a$ associated with the subject
$x$ of the output prefix, since the emergence of P means the removal of an input
prefix with subject $x$ somewhere else in the environment. We then take the join of
all monoid elements and 0, since we only consider capabilities on the outer level of
processes and thus we only consider future influence by positive capabilities, but
not by negative ones (since we are doing over-approximation). In the end out is
added to the monoid element associated with $x$.

Furthermore we have to check that the sort structure is correct, i.e. since $z_1, \ldots, z_n$
are communicated via $x$, the string of their sorts must be the object sort of the sort
of $x$.

(T-In) As described for (T-Out), we subtract out, take the join with 0 and then add
in. Furthermore we check the correctness of the sort structure as above.

Since, in this case, $y_1, \ldots, y_n$ are bound by the input prefix, all assumptions on
these names are removed from the type assignment, and their monoid elements are
added to the rest of the type assignment with $\oplus$.

(T-Restr) If a name is hidden, we remove the assumption on it, but retain information
on its capabilities by adding its monoid element to the type assignment and by
keeping the sort.

(T-Rep) In this rule we have to make sure that a replicated process has a type
assignment which is either idempotent or gets smaller when added to itself. This can be
achieved if $\Gamma$ contains only negative or idempotent monoid elements.

And furthermore, since we know that infinitely many copies of the input prefix with
subject $x$ are available, we add $\bot$, $\top$ or 0, according to the value of in.
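
As a worked instance of (T-Nil), (T-In) and (T-Out), consider the process $\bar{x}.x.0$ from the introduction, with no transmitted names. The choices below are ours and serve only as an illustration: we assume the product monoid $\mathbb{Z}^\infty \times \mathbb{Z}^\infty$ with componentwise order, $out = (1,0)$ and $in = (0,1)$ as in section 5.2, and we start from the element $(0,0)$ for $x$ at the nil process. Following the explanation of the rules, the join with 0 is taken before the final addition.

```latex
\begin{align*}
0 &: & a_0 &= (0,0) \\
x.0 &: & a_1 &= \bigl((a_0 - \mathit{out}) \vee 0\bigr) + \mathit{in}
              = \bigl((-1,0) \vee (0,0)\bigr) + (0,1) = (0,1) \\
\bar{x}.x.0 &: & a_2 &= \bigl((a_1 - \mathit{in}) \vee 0\bigr) + \mathit{out}
              = \bigl((0,0) \vee (0,0)\bigr) + (1,0) = (1,0)
\end{align*}
```

The result $(1,0)$ records one active output prefix and no active input prefix for $x$: the input prefix hidden behind the output is cancelled by the residual, as anticipated in the introduction.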

The type system satisfies the following substitution lemma, which is central for
proving the subject reduction property:

Lemma 1. (Substitution) Let $x \neq y$ be two names.

If $\Gamma, x\colon (s/a), y\colon (s/b) \vdash P$, then $\Gamma, x\colon (s/a + b) \vdash P\{x/y\}$.

Remark: the proofs can be found in the extended version of this paper [11].

The types defined in table 3 are not yet invariant under reduction: rather than $\Gamma$, a
modified type assignment $\bar\Gamma$ satisfies the subject reduction property.

Let $\Gamma = ob; m; \tilde{x}\colon (\tilde{s}/\tilde{a})$ and define $\bar\Gamma = (ob; m; \tilde{x}\colon (\tilde{s}/0))^{(\tilde{s},\tilde{a})}$. That is, we add
all monoid elements of the remaining free names to $m$ with $\oplus$. Constructing $\bar\Gamma$ directly
during the typing process does not seem to be possible, since we first have to sum up
monoid elements with $+$ and then add them to the type tree with $\oplus$ the moment they
are hidden.

Proposition 1. (Subject Reduction Property) If $P \equiv Q$ and $\Gamma \vdash P$, then it holds
also that $\Gamma \vdash Q$. And if $P \to P'$ and $\Gamma \vdash P$ then there exists a type assignment $\Gamma'$
such that $\Gamma' \vdash P'$ and $\bar\Gamma' \le \bar\Gamma$.



4 Using the Type System for Process Analysis 

As in other type systems for mobile processes, a type guarantees absence of runtime 
errors which may appear in the form of arity mismatches in the communication rules 
(R-COMM) and (R-Rep), but it also enables us to perform more detailed process anal- 
ysis. 

4.1 Process Capabilities 

The aim of this paper is to construct type systems yielding useful results for the analysis
and verification of parallel processes. In our case the generic type system gives
information concerning structural properties of a process, especially concerning its input and
output capabilities. We will now formally define the connection between the type of a
process and its capabilities.

Let P be a process and let $x$ be a free name occurring in P. We define P’s capability
with respect to $x$ by adding the following monoid elements: for every use of $x$ as an output port
we add out and for every use of $x$ as an input port we add in. Notice that we do not
continue summation after prefixes (see table 4).



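$$C_x(0) = 0 \qquad C_x(P \,|\, Q) = C_x(P) + C_x(Q)$$

$$C_x(\bar{z}\langle\tilde{y}\rangle.P) = \begin{cases} out & \text{if } x = z \\ 0 & \text{otherwise} \end{cases}
\qquad
C_x(z(\tilde{y}).P) = \begin{cases} in & \text{if } x = z \\ 0 & \text{otherwise} \end{cases}$$

$$C_x(!z(\tilde{y}).P) = \begin{cases} sig(in) & \text{if } x = z \\ 0 & \text{otherwise} \end{cases}
\qquad
C_x((\nu y\colon s)P) = \begin{cases} C_x(P) & \text{if } x \neq y \\ 0 & \text{otherwise} \end{cases}$$

Table 4. Determining the capabilities of a process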
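
The definition in table 4 translates almost directly into a recursive function. The sketch below makes several assumptions of its own: processes are represented by small hypothetical dataclasses, monoid elements are numbers (with `math.inf` for the top element), and the three monoid elements `out`, `inp` and `sig_in` (standing for out, in and sig(in)) are passed in by the caller rather than computed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class Par:
    left: object
    right: object


@dataclass(frozen=True)
class Out:
    subject: str
    args: tuple
    cont: object


@dataclass(frozen=True)
class In:
    subject: str
    params: tuple
    cont: object
    replicated: bool = False


@dataclass(frozen=True)
class Restr:
    name: str
    sort: str
    body: object


def capability(x, p, out, inp, sig_in):
    """C_x(p): sum over active prefixes only, never descending below a prefix."""
    if isinstance(p, Nil):
        return 0
    if isinstance(p, Par):
        return (capability(x, p.left, out, inp, sig_in)
                + capability(x, p.right, out, inp, sig_in))
    if isinstance(p, Out):
        return out if p.subject == x else 0
    if isinstance(p, In):
        if p.subject != x:
            return 0
        return sig_in if p.replicated else inp
    if isinstance(p, Restr):
        return capability(x, p.body, out, inp, sig_in) if p.name != x else 0
    raise TypeError(f"unknown process form: {p!r}")


# x-bar.x.0 | x.0, counting active input prefixes only (out = 0, in = 1).
p = Par(Out("x", (), In("x", (), Nil())), In("x", (), Nil()))
print(capability("x", p, out=0, inp=1, sig_in=math.inf))
```

Only the input prefix on the outer level is counted; the one guarded by the output prefix does not contribute, which is exactly the behaviour the residual in (T-Out) accounts for.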

Proposition 2. If $\Gamma \vdash P$, $P \to^* P'$ and $x$ is a free name of P it follows that $C_x(P') \le
m_{\bar\Gamma}(sort_\Gamma(x))$, i.e. we determine the sort of $x$ and look up the corresponding monoid
element in $\bar\Gamma$. And if P contains a subexpression $(\nu y\colon s')Q$ it follows that the capabilities
of $y$ will never exceed $m_{\bar\Gamma}(s')$.

4.2 Type Inference 

In order to support our claim that the type system is useful for the automated analysis
of processes, we roughly sketch a type inference algorithm, determining the smallest
type (in the $\le$ relation defined in section 3) of a process P, provided P has a type. In
order to make sure that a smallest type exists, we impose the following condition on
the l-monoid: for every monoid element $a \in I$ there is a smallest element $a'$ such that
$a \le a'$ and $a' + a' \le a'$ (the same must be true for the operation $\oplus$).*

The algorithm proceeds in two steps: 

- In the first step we determine the assignment of sorts to names and the mapping
$ob_\Gamma$. This may be done by representing ob as a graph and refining ob step by step by
collapsing graph nodes every time we encounter a constraint of the form $ob(s) = \tilde{s}$.
Or we can use the sort inference algorithm by Simon Gay [5].

- In the second step we compute the monoid elements by induction on the structure of
P. In this case the typing rules are already very constructive; the main complication
arises from typing rule (T-Rep). Here we require that the monoid $I$ satisfies the
condition stated above. So (because of rule (T-≤)) we may replace every monoid
element $a$ with its corresponding $a'$ in the type assignment that we have derived so
far.

A straightforward implementation of the algorithm has a runtime complexity
quadratic in the size of P. Improvements are certainly possible by using efficient algorithms for
unification and by finding an intelligent strategy for computing the monoid elements.
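
For the counting monoid of section 2.2, the element $a'$ required by the condition above has a simple closed form, which the footnote to this section states in words. A minimal sketch, assuming numbers with `math.inf` as the top element and a function name `closure` of our own:

```python
import math


def closure(a):
    """Smallest a' with a <= a' and a' + a' <= a' in the integers with infinities."""
    # Positive elements grow when added to themselves, so only infinity works.
    # Zero and negative elements already satisfy a + a <= a.
    return math.inf if a > 0 else a


for a in (-3, 0, 2):
    print(a, closure(a))
```

This is the replacement step used when typing a replicated input prefix: every label is pushed up to the nearest element that does not grow under self-addition.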

* Every l-monoid useful for process analysis that we have come across so far satisfies this
condition. In the case of $\mathbb{Z}^\infty$, $a'$ is $\infty$ for positive $a$ and $a$ itself for all other elements.

5 Examples

We now get back to the two example processes P and P' introduced in section 2.1
and type them with several instantiations of our type system, and thereby show how to
mechanise process analysis in these cases.

We use the algorithm presented in section 4.2 to derive a type
assignment $\Gamma$ for P and P' and in the first step obtain a sort
structure $ob_\Gamma$ as shown in the figure to the left ($ob_\Gamma$ is the
same for P and P'). If there is an arrow labelled $ob_i$ from
sort $s$ to sort $t$, then $t$ is the i-th element of the sequence
$ob_\Gamma(s)$. The assignment of names (in brackets we give the
bound names) to sorts is:

$c\colon s_c \qquad (d)\colon s_d \qquad h, i_1, (r, a, s, h_1)\colon s_1 \qquad i_2, (h_2)\colon s_2$

In the second step the monoid elements $m_{\bar\Gamma}(s)$ are computed (see below) in order
to give an upper bound for all names having sort $s$.

5.1 Input/Output Behaviour of Channels 

One simple application of our type system is to check whether channels are used for
input, output or for both. We use the monoid IO (with elements none, O “output only”,
I “input only” and both) introduced in section 2.2. We set in = I, out = O.

For both processes P and P' we obtain the same type assignments with monoid
elements shown in table 5 (row 1), i.e. $i_2$, $h_2$ are used neither for input nor output while
all other names may be used for both. Note that, because of residuation, typing F alone
yields capability I for name c, but no output capability. c acquires output capability only
if communication with the environment is taking place.

This type system is similar to the one in [15] (apart from the fact that we consider
types as a representation of process capabilities, rather than constraints on the
environment). Our type system, however, lacks a concept of co- and contravariance and thus our
bounds are less tight.

5.2 Upper Bounds on the Number of Active Channels 

We attempt to define a type system similar to the one presented in [8] for our
framework, i.e. we want to check how often a channel is used either for input or output.

We use the l-monoid $\mathbb{Z}^\infty \times \mathbb{Z}^\infty$ (cartesian product of the set of integers with $\infty$ and
$-\infty$) introduced in section 2.2. The first component represents the number of active
output prefixes (with a fixed subject) and the second component represents the number
of active input prefixes.

We set out = (1, 0), in = (0, 1), and typing the processes P and P' yields the
results given in table 5 (rows 2 & 3). Since for P the upper bound is always (1, 1) or
smaller we can conclude that there is at most one active input port and one active output
port for any given subject at a time. For P' we can guarantee that, e.g. c always occurs
at most once as an output prefix, although it occurs under a replication (see monoid
element $m_{\bar\Gamma}(s_c)$).

Row  Property to be checked                    $m_{\bar\Gamma}(s_d)$   $m_{\bar\Gamma}(s_c)$   $m_{\bar\Gamma}(s_1)$   $m_{\bar\Gamma}(s_2)$
1    Input/Output behaviour of P and P'        both        both        both        none
2    Upper bounds on active channels in P      (1,1)       (1,1)       (1,1)       (0,0)
3    Upper bounds on active channels in P'     (∞,∞)       (1,∞)       (1,∞)       (0,0)
4    Lower bounds on active channels in P      (-1,0)      (-1,0)      (-1,-1)     (0,0)
5    Lower bounds on active channels in P'     (-∞,∞)      (-∞,∞)      (-∞,-1)     (0,0)
6    Avoiding blocked output prefixes in P'    (∞,∞)       (1,∞)       (1,-1)      (0,0)

Table 5. Resulting monoid elements for different instantiations of the generic type system

5.3 Confluence 

As in [8] we can use upper bounds on the number of active channels to guarantee
confluence for π-calculus processes (see also [14]). Let Q be a process such that for every
name $x$ in Q which is either free or bound by the scope operator $\nu$ its
capabilities never exceed (1, 1). Then we can guarantee that every channel (also bound
channels) occurs at most once at any given time as active input and output prefix, and we
have non-overlapping redexes in (R-COMM). Thus we can conclude that if $Q \to^* Q'$,
$Q' \to Q_1$ and $Q' \to Q_2$, then either $Q_1 \equiv Q_2$ or there is a process $Q_3$ such that
$Q_1 \to Q_3$ and $Q_2 \to Q_3$.

Row 2 in table 5 provides upper bound (1,1) for all capabilities in P. So we can
state that P is confluent. Note that the same process would not be recognised as
confluent by the type system in [8].
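
The confluence criterion amounts to a simple check on the computed sort labels. The following sketch assumes that the labels are given as a dictionary from sort names to pairs (outputs, inputs); the function name and the dictionary layout are ours, and the values are those of rows 2 and 3 of table 5.

```python
import math

INF = math.inf


def may_claim_confluence(bounds):
    """True if every sort label is at most (1, 1) in the componentwise order."""
    return all(outs <= 1 and ins <= 1 for outs, ins in bounds.values())


row2_P = {"s_d": (1, 1), "s_c": (1, 1), "s_1": (1, 1), "s_2": (0, 0)}
row3_P_prime = {"s_d": (INF, INF), "s_c": (1, INF), "s_1": (1, INF), "s_2": (0, 0)}

print(may_claim_confluence(row2_P))
print(may_claim_confluence(row3_P_prime))
```

The check is only a sufficient condition: a negative answer, as for P', means that this instantiation of the type system does not establish confluence, not that the process fails to be confluent.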



5.4 Lower Bounds on the Number of Active Channels 

The type system is not limited to statements of the form “there are at most n active
channels”; we can also guarantee that there are at least m active channels. In order to achieve
this, we use the type system above and just invert the partial order, i.e. we take $\ge$ instead
of $\le$; out and in remain unchanged. This means also that the join $\vee$ in the new partial
order is now the meet $\wedge$ of the original partial order. Typing P does not give us much
information, since we cannot guarantee that there are at least m > 0 prefixes active at
any given time (see table 5, row 4) for any channel. In fact, some lower bounds are even
(-1), stating that the respective channel removes input (or output) prefixes instead of
making them available. In this case $P \to^* 0$, which means that no lower bounds can be
guaranteed.

Typing P' yields the monoid elements given in table 5 (row 5), which states that
input prefixes with subjects c, d are available infinitely often.

5.5 Avoiding Blocked Prefixes 

Another interesting feature is to avoid blocked prefixes, i.e. prefixes which are
waiting for a non-existing communication partner. We will first define, with the help of a
lattice-ordered monoid, what it means for an output prefix to be blocked.

We take $\mathbb{Z}^\infty \times \mathbb{Z}^\infty$ as an l-monoid and define a new partial order: $(i, j) \sqsubseteq (i', j')$ iff
$i \le i'$ and $j \ge j'$. The first component represents the number of output prefixes and the
second the number of input prefixes of the same subject, out = (1, 0) and in = (0, 1).
We say a name $x$ is blocking in P, if $P \to^* P'$, $C_x(P') \sqsupseteq (1, 0)$ (i.e. there is at least
one output prefix with subject $x$ and no corresponding input prefixes) and for all $P''$
with $P' \to^* P''$ it follows that $C_x(P'') \sqsupseteq (1, 0)$ (no communication with $x$ will ever
take place).

We can, e.g., avoid this situation by demanding that it is always the case that
$C_x(P') = (a, b)$ and either $a \le 0$ or $b \ge 1$ (i.e. $(a, b) \not\sqsupseteq (1, 0)$). We take the
l-monoid and out, in introduced above. This type system can be obtained by composing
a type system establishing upper bounds for output prefixes and one establishing lower
bounds for input prefixes (compare row 6 of table 5 with rows 3 and 5). In this way we
find out that all output prefixes with subjects c and d are non-blocking in P'.
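
The modified order and the resulting check can be sketched as follows. The function names are ours; the pairs are (outputs, inputs) as above, and the values tested are the labels of row 6 in table 5. The reasoning behind `excludes_blocking` is our own reading: if a label bounds every reachable capability from above in the new order, and (1, 0) is not below that label, then (1, 0) cannot be below any reachable capability either.

```python
import math

INF = math.inf


def below(p, q):
    """(i, j) is below (i', j') iff i <= i' and j >= j'."""
    return p[0] <= q[0] and p[1] >= q[1]


def looks_blocked(cap):
    """At least one output prefix and no input prefix: (1, 0) is below cap."""
    return below((1, 0), cap)


def excludes_blocking(label):
    """A sort label rules out blocked output prefixes if (1, 0) is not below it."""
    return not below((1, 0), label)


row6 = {"s_d": (INF, INF), "s_c": (1, INF), "s_1": (1, -1), "s_2": (0, 0)}
for sort, label in row6.items():
    print(sort, excludes_blocking(label))
```

For the sorts of c and d the label has infinitely many guaranteed input prefixes, so blocked outputs are excluded; for $s_1$ the label (1, -1) gives no such guarantee.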

6 Conclusion and Future Work 

This work has a similar aim to that of Honda [6], in that it attempts to describe a
general framework for process analysis using type systems. We concentrate on a more
specialised but still generic type system, which enables us to prove the subject
reduction property for the general case. We have shown that, despite its generality, the type
system can be instantiated in order to yield type systems related to existing ones. We
have also shown how to parameterise type systems and what kind of parameters are
feasible (in our case an l-monoid).

Another type system that has close connections to ours is the linear type system 
by Kobayashi, Pierce and Turner [8], since it also involves the typing of input/output 
capabilities of processes. Apart from the more general approach, one new feature of our 
type system is the introduction of residuation which allows us to recognise the process 
P in section 5 as confluent, in contrast to the type system in [8]. In some other cases 
however, our bounds are less tight. The central aim of [8] is to introduce a new notion of 
barbed congruence by reducing the possible contexts of a process. This question has not 
been addressed in this paper, it is an interesting direction for future work. For a more 
detailed discussion of the relation between the two type systems see the full version of 
this paper [11]. 

Our type system was derived from a type system for a graph-based process calculus
with graphs as types, which make it easier to add additional behaviour information and
which have a clear correspondence to associated monoid elements (via morphisms and
categorical functors) [9]. A graph-based type system with lattices instead of monoids
was presented in [10]. For lattices or positive cones of l-monoids, generic type systems
are much easier to present. The main complication arises from non-positive elements
and residuation.

Inspiration for this work came from papers deriving information on the behaviour
of a process by inspecting its input/output capabilities, such as [15, 14, 8]. In order to
conduct process analysis concerning more complex properties (as was done e.g. in [7,
4]) it is necessary to use type systems assigning behaviour information (i.e. monoid
elements in our case) not only to single channels, but rather to tuples of channels or
other more complex structures. This normally results in a semi-additive type system, in
the terminology of Honda [6], while our present type system is strictly additive. In order
to extend this type system, a first solution would be to allow monoid labels for n-ary
tuples of names. Another idea is to integrate it into the categorical framework presented
in [10], which would allow us to specify very general behaviour descriptions.

We believe that generic type systems can be developed into tools suitable for fast 
debugging and the analysis of concurrent programs. The next step is to apply the type 
system presented here to “real-life examples” and to more realistic programming lan- 
guages. 

Acknowledgements: I would like to thank the anonymous referees for their helpful
comments, especially for the suggestion to use a sort system instead of type trees.

Abstract. We propose an extension of the asynchronous π-calculus in
which a variety of security properties may be captured using types. These
are an extension of the Input/Output types for the π-calculus in which
I/O capabilities are assigned specific security levels.

We define a typing system which ensures that processes running at
security level $\sigma$ cannot access resources with a security level higher than
$\sigma$. The notion of access control guaranteed by this system is formalized
in terms of a Type Safety theorem.

We then show that, for a certain class of processes, our system
prohibits implicit information flow from high-level to low-level processes.

We prove that low-level behaviour cannot be influenced by changes to
high-level behaviour. This is formalized as a Non-Interference Theorem
with respect to may testing.

1 Introduction 

The problem of protecting information and resources in systems with multiple
sensitivity or security levels [2] has been studied extensively. Flow analysis
techniques have been used in [3,4], axiomatic logic in [13], while in [27, 15] type
systems have been developed for a number of prototypical programming
languages. In this paper, we explore the extent to which type systems for ensuring
various forms of security can also be developed for the asynchronous π-calculus
[5,17]. We discuss two security issues: resource access control and information
control. The former is described in terms of runtime errors, the latter in terms
of non-interference [27, 11].

The (asynchronous) π-calculus is a very expressive language for describing
distributed systems [5, 23, 12], in which processes intercommunicate using
channels. Thus $n?(x)\,P$ is a process which receives some value on the channel named
n, binds it to the variable x and executes the code P. Corresponding to this
input command is the asynchronous output command $n!\langle v\rangle$ which outputs the
value v on n. The set of values which may be transmitted on channels includes
channel names themselves; this, together with the ability to dynamically create
new channel names, gives the language its descriptive power.

Within the setting of the π-calculus we wish to investigate the use of types to
enforce security policies. To facilitate the discussion we extend the syntax with
a new construct to represent a process running at a given security clearance,
$\sigma[\![P]\!]$. Here $\sigma$ is some security level taken from a complete lattice of security
levels SL and P is the code of the process. Further, we associate with each
channel, the resources in our language, a set of input/output capabilities [22,
24], each decorated with a specific security level. Intuitively, if channel n has
a read capability at level $\sigma$, then only processes running at security level $\sigma$ or
higher may read from n. This leads to the notion of a security policy S, which
associates a set of capabilities with each channel in the system. The question
then is to design a typing system which ensures that processes do not violate
the given security policy.

Of course this depends on when we consider such a violation to take place.
For example if S assigns the channel or resource n the highest security level top
then it is reasonable to say that a violation will eventually occur in

$$c!\langle n\rangle \mid \mathsf{bot}[\![c?(x)\; x?(y)\, P]\!]$$

as after the communication on c, a low level process, $\mathsf{bot}[\![n?(y)\, P]\!]$, has gained
access to the high level resource n. Underlying this example is the principle that
processes at a given security level $\sigma$ should have access to resources at security
level at most $\sigma$. We formalize this principle in terms of a relation $P \longmapsto err$,
indicating that P violates the security policy S.

To prevent such errors, we restrict attention to security policies that are
somehow consistent. Let $\Gamma$ be such a consistent policy; consistency is defined by
restricting types so that they respect a subtyping relation. We then introduce a
typing system, $\Gamma \vdash P$, which ensures that P can never violate $\Gamma$:

If $\Gamma \vdash P$ then for every context $C[\,]$ such that $\Gamma \vdash C[P]$ and every Q
which occurs during the execution of $C[P]$, that is $C[P] \to^* Q$, we have
$Q \not\longmapsto err$.

Thus our typing system ensures that low level processes will never gain access to
high level resources. The typing system implements a particular view of security,
which we refer to as the R-security policy, as it offers protection to resources. Here
communication is allowed between high level and low level principals, provided
of course that the values involved are appropriate.

This policy does not rule out the possibility of information leaking indirectly 
from high security to low security principals. Suppose $h$ is a high channel and $hl$ 
is a channel with high-level write access and low-level read access in: 

$$\mathsf{top}[\![h?(x)\ \text{if } x = 0 \text{ then } hl!\langle 0\rangle \text{ else } hl!\langle 1\rangle]\!] \mid \mathsf{bot}[\![hl?(z)\, Q]\!] \qquad (\star)$$

This system can be well-typed although there is some implicit information flow 
from the high security agent to the low security one; the value received on the 
high level channel $h$ can be determined by the low level process $Q$. 

It is difficult to formalize exactly what is meant by implicit information flow 
and in the literature various authors have instead relied on non-interference 
[14, 25, 11, 26], a concept more amenable to formalization, which ensures, at least 
informally, the absence of implicit information flow. 

To obtain such results for the $\pi$-calculus we need, as the above example 
shows, a stricter security policy, which we refer to as the I-security policy. This 
allows a high level principal to read from low level resources but not to write to 
them. Using the terminology of [2, 7]: 

- write up: a process at level $\sigma$ may only write to channels at level $\sigma$ or above 

- read down: a process at level $\sigma$ may only read from channels at level $\sigma$ or below. 

In fact the type inference system remains the same and we only need constrain 
the notion of type. In this restricted type system well-typing, $\Gamma \Vdash P$, ensures a 
form of non-interference. 

To formalize this non-interference result we need to develop a notion of process 
behaviour, relative to a given security level. Since the behaviour of processes 
also depends on the type environment in which they operate we need to define 
a relation 

$$P \approx^{\sigma}_{\Gamma} Q$$

which intuitively states that, relative to $\Gamma$, there is no observable distinction 
between the behaviour of $P$ and $Q$ at security level $\sigma$; processes running at 
security level $\sigma$ can observe no difference in the behaviour of $P$ and $Q$. Lack of 
information flow from high to low security levels now means that this relation 
is invariant under changes in high-level values; or indeed under changes in 
high-level behaviour. 

It turns out that the extent to which this is true depends on the exact 
formulation of the behavioural equivalence $\approx^{\sigma}_{\Gamma}$. We show that it is not true if $\approx^{\sigma}_{\Gamma}$ is 
based on observational equivalence [19] or must testing equivalence [21]. But a 
result can be established if we restrict our attention to may testing equivalence 
(here written $\simeq^{\sigma}_{\Gamma}$). Specifically we will show that, for certain $H, K$: 

If $\Gamma \Vdash^{\sigma} P, Q$ and $\Gamma \Vdash^{\mathsf{top}} H, K$ then $P \simeq^{\sigma}_{\Gamma} Q$ implies $P \mid H \simeq^{\sigma}_{\Gamma} Q \mid K$ $\qquad (\star\star)$

The remainder of the paper is organized as follows. In the next section we 
define the security $\pi$-calculus, giving a labelled transition semantics and a formal 
definition of runtime errors. In Section 3 we design a set of types and a typing 
system which implements the resource control policy. This section also contains 
Subject Reduction and Type Safety theorems. In Section 4 we motivate the 
restrictions required on types and terms in order to implement the information 
control policy. We also give a precise statement of our non-interference result, 
and give counter-examples to related conjectures based on equivalences other 
than may testing. 

The proof of our main theorem depends on an analysis of may testing in 
terms of asynchronous sequences of actions [6] which in turn depends on a more 
explicit operational semantics for our language, where actions are parameterised 
relative to a typing environment. The details may be found in the full version of 
the paper, [16]. 

Fig. 1 Syntax 

$$
\begin{array}{lll}
P, Q ::= & & \text{Terms} \\
 & u!\langle v\rangle & \text{Output} \\
 & u?(X : A)\, P & \text{Input} \\
 & \text{if } u = v \text{ then } P \text{ else } Q & \text{Matching} \\
 & \sigma[\![P]\!] & \text{Security level} \\
 & (\text{new } a : A)\, P & \text{Name creation} \\
 & P \mid Q & \text{Composition} \\
 & {*P} & \text{Replication} \\
 & \mathbf{0} & \text{Termination} \\[1ex]
X, Y ::= & & \text{Patterns} \\
 & x & \text{Variable} \\
 & (X_1, \ldots, X_k) & \text{Tuple} \\[1ex]
u, v, w ::= & & \text{Values} \\
 & bv & \text{Base value} \\
 & a & \text{Name} \\
 & x & \text{Variable} \\
 & (u_1, \ldots, u_k) & \text{Tuple}
\end{array}
$$


2 The Language 

The syntax of the security $\pi$-calculus, given in Figure 1, uses a predefined set of 
names, ranged over by $a, b, \ldots, n$ and a set of variables, ranged over by $x, y, z$. 
Identifiers are either variables or names. Security annotations, ranged over by 
small Greek letters $\sigma, \rho, \ldots$, are taken from a complete lattice 
$\langle SL, \sqcap, \sqcup, \mathsf{top}, \mathsf{bot}\rangle$ of security levels. We also assume for each $\sigma$ a set 
of basic values $BV_{\sigma}$; we use $bv$ to range over base values. We require that all 
syntactic sets be disjoint. 

The binding constructs $u?(X : A)\, Q$ and $(\text{new } a : A)\, Q$ introduce the usual 
notions of free names and variables, $\mathrm{fn}(P)$ and $\mathrm{fv}(P)$, respectively, and 
associated notions of substitution; details may be found in the full version. Moreover 
the typing annotations on the binding constructs, which will be explained in 
Section 3, are omitted whenever they do not play a role. 

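The behaviour of a process is determined by the interactions in which it can 
engage. To define these, we give a labelled transition semantics (LTS) for the 
language. The set Act of labels, or actions, is defined as follows: 

$$
\begin{array}{lll}
\mu ::= & & \text{Actions} \\
 & \tau & \text{Internal action} \\
 & (\tilde{c} : \tilde{C})\, a?v & \text{Input of } v \text{ on } a \text{ learning private names } \tilde{c} \\
 & (\tilde{c} : \tilde{C})\, a!v & \text{Output of } v \text{ on } a \text{ revealing private names } \tilde{c}
\end{array}
$$

Visible actions (all except $\tau$) are ranged over by $\alpha, \beta$ and we use $\mathcal{E}(\alpha)$ to denote 
the bound names in $\alpha$, together with their types: 
$\mathcal{E}((\tilde{c} : \tilde{C})\, a!v) = \mathcal{E}((\tilde{c} : \tilde{C})\, a?v) = (\tilde{c} : \tilde{C})$. 
Further, let $n(\mu)$ be the set of names occurring in $\mu$, whether free or 
bound. We say that the actions ‘$(\tilde{c} : \tilde{C})\, a?v$’ and ‘$(\tilde{c} : \tilde{C})\, a!v$’ are complementary, 
with $\bar{\alpha}$ denoting the complement of $\alpha$. 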

The LTS is defined in Figure 2 and for the most part the rules are straight- 
forward; it is based on the standard operational semantics from [20], to which 
the reader is referred for more motivation. 

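Fig. 2 Labelled Transition Semantics 

$$
\begin{array}{ll}
\text{(L-OUT)} & a!\langle v\rangle \xrightarrow{a!v} \mathbf{0} \\[1ex]
\text{(L-IN)} & a?(X)\, P \xrightarrow{(\tilde{c}:\tilde{C})a?v} P\{v/X\} \qquad \tilde{c} \notin \mathrm{fn}(P),\ \tilde{c} \in \mathrm{fn}(v) \\[1ex]
\text{(L-OPEN)} & \dfrac{P \xrightarrow{(\tilde{c}:\tilde{C})a!v} P'}{(\text{new } b : B)\, P \xrightarrow{(b:B)(\tilde{c}:\tilde{C})a!v} P'} \qquad b \neq a,\ b \in \mathrm{fn}(v) \\[2ex]
\text{(L-COM)} & \dfrac{P \xrightarrow{\alpha} P' \quad Q \xrightarrow{\bar{\alpha}} Q'}{P \mid Q \xrightarrow{\tau} (\text{new } \mathcal{E}(\alpha))\,(P' \mid Q')} \\[2ex]
\text{(L-EQ)} & \text{if } u = u \text{ then } P \text{ else } Q \xrightarrow{\tau} P \qquad \text{if } u = w \text{ then } P \text{ else } Q \xrightarrow{\tau} Q \quad (u \neq w) \\[1ex]
\text{(L-CTXT)} & \dfrac{P \xrightarrow{\mu} P'}{{*P} \xrightarrow{\mu} {*P} \mid P'} \qquad \dfrac{P \xrightarrow{\mu} P'}{\sigma[\![P]\!] \xrightarrow{\mu} \sigma[\![P']\!]} \qquad \dfrac{P \xrightarrow{\mu} P'}{(\text{new } a : A)\, P \xrightarrow{\mu} (\text{new } a : A)\, P'} \quad a \notin n(\mu) \\[2ex]
 & \dfrac{P \xrightarrow{\mu} P'}{P \mid Q \xrightarrow{\mu} P' \mid Q \qquad Q \mid P \xrightarrow{\mu} Q \mid P'} \quad \mathrm{bn}(\mu) \cap \mathrm{fn}(Q) = \emptyset
\end{array}
$$

Informally a security policy associates a security level with each input/output 
capability on a channel. To this end, pre-capabilities and pre-types are defined 
as follows: 
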



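$$
\begin{array}{lll}
cap ::= & & \text{Pre-Capability} \\
 & w_{\sigma}\langle A\rangle & \sigma\text{-level process can write values with type } A \\
 & r_{\sigma}\langle A\rangle & \sigma\text{-level process can read values with type } A \\[1ex]
A ::= & & \text{Pre-Type} \\
 & B_{\sigma} & \text{Base type} \\
 & \{cap_1, \ldots, cap_k\} & \text{Resource type } (k \geq 0) \\
 & (A_1, \ldots, A_k) & \text{Tuple type } (k \geq 0)
\end{array}
$$
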

We will tend to abbreviate a singleton set of capabilities, $\{cap\}$, to $cap$. 

A security policy, $\Sigma$, is a finite mapping from names to pre-types. Thus, for 
example, if $\Sigma$ maps the channel $lh$ to the pre-type $\{w_{\mathsf{bot}}\langle B\rangle, r_{\mathsf{top}}\langle A\rangle\}$, for some 
appropriate $A$, $B$, then low level processes may write to $lh$ but only high level 
ones may read from it; this is an approximation of the security associated with 
a mailbox. On the other hand if $\Sigma$ maps $hl$ to $\{r_{\mathsf{bot}}\langle A\rangle, w_{\mathsf{top}}\langle B\rangle\}$ then $hl$ acts 
more like an information channel; anybody can read from it but only high level 
processes may place information there. 

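The import of a security policy may be underlined by defining what it means 
to violate it. Our definition is given in Figure 3, in terms of a relation 
$P \stackrel{\Sigma}{\longmapsto} err$. 
For example, relative to the policy $\Sigma$ defined above, after one reduction step 
of the process $\mathsf{top}[\![c!\langle hl\rangle]\!] \mid \mathsf{bot}[\![c?(x)\, x!\langle v\rangle]\!]$, there is a security error because 
$\mathsf{bot}[\![hl!\langle v\rangle]\!] \stackrel{\Sigma}{\longmapsto} err$. The low security process has only read access to the 
channel $hl$, on which write access is reserved for high-security processes. Assuming 
an appropriate typing for $c$ and $v$ the same security error does not occur in 
$\mathsf{top}[\![c!\langle lh\rangle]\!] \mid \mathsf{bot}[\![c?(x)\, x!\langle v\rangle]\!]$. The low security process $\mathsf{bot}[\![lh!\langle v\rangle]\!]$ has the right 
to write on the channel $lh$. 

Fig. 3 Runtime Errors 

$$
\begin{array}{ll}
\text{(E-RD)} & \rho[\![a?(X)\, P]\!] \stackrel{\Sigma}{\longmapsto} err \quad \text{if } \sigma \preceq \rho \text{ implies for all } A,\ r_{\sigma}\langle A\rangle \notin \Sigma(a) \\[1ex]
\text{(E-WR1)} & \rho[\![a!\langle v\rangle]\!] \stackrel{\Sigma}{\longmapsto} err \quad \text{if } \sigma \preceq \rho \text{ implies for all } A,\ w_{\sigma}\langle A\rangle \notin \Sigma(a) \\[1ex]
\text{(E-WR2)} & \rho[\![a!\langle v\rangle]\!] \stackrel{\Sigma}{\longmapsto} err \quad \text{if } bv \in v,\ bv \in B_{\sigma} \text{ and } \sigma \not\preceq \rho \\[2ex]
\text{(E-STR)} & \dfrac{P \stackrel{\Sigma}{\longmapsto} err}{P \mid Q \stackrel{\Sigma}{\longmapsto} err} \qquad \dfrac{P \stackrel{\Sigma}{\longmapsto} err}{\rho[\![P]\!] \stackrel{\Sigma}{\longmapsto} err} \qquad \dfrac{P \equiv Q \quad P \stackrel{\Sigma}{\longmapsto} err}{Q \stackrel{\Sigma}{\longmapsto} err} \qquad \dfrac{P \stackrel{\Sigma,\, n:A}{\longmapsto} err}{(\text{new } n : A)\, P \stackrel{\Sigma}{\longmapsto} err}
\end{array}
$$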



3 Resource Control 



Our typing system will apply only to certain security policies, those in which the 
pre-types are in some sense consistent. Consistency is imposed using a system 
of kinds: the kind $RType_{\sigma}$ comprises the value types accessible to processes at 
security level $\sigma$. These kinds are in turn defined using a subtyping relation on 
pre-capabilities and pre-types. 

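Definition 1. Let $<:$ be the least preorder on pre-capabilities and pre-types such 
that: 

$$
\begin{array}{lll}
\text{(U-WR)} & w_{\sigma}\langle A\rangle <: w_{\sigma}\langle B\rangle & \text{if } B <: A \\
\text{(U-RD)} & r_{\sigma}\langle A\rangle <: r_{\rho}\langle B\rangle & \text{if } A <: B \text{ and } \sigma \preceq \rho \\
\text{(U-BASE)} & B_{\sigma} <: B_{\rho} & \text{if } \sigma \preceq \rho \\
\text{(U-RES)} & \{cap_i\}_{i \in I} <: \{cap'_j\}_{j \in J} & \text{if } (\forall j)(\exists i)\ cap_i <: cap'_j \\
\text{(U-TUP)} & (A_1, \ldots, A_k) <: (B_1, \ldots, B_k) & \text{if } (\forall i)\ A_i <: B_i
\end{array}
$$

For each $\rho$, let $RType_{\rho}$ be the least set that satisfies: 

$$
\begin{array}{ll}
\text{(RT-WR)} & \dfrac{A \in RType_{\sigma} \quad \sigma \preceq \rho}{\{w_{\sigma}\langle A\rangle\} \in RType_{\rho}} \\[2ex]
\text{(RT-RD)} & \dfrac{A \in RType_{\sigma} \quad \sigma \preceq \rho}{\{r_{\sigma}\langle A\rangle\} \in RType_{\rho}} \\[2ex]
\text{(RT-BASE)} & \dfrac{\sigma \preceq \rho}{B_{\sigma} \in RType_{\rho}} \\[2ex]
\text{(RT-WRRD)} & \dfrac{A \in RType_{\sigma} \quad \sigma \preceq \rho \quad A' \in RType_{\sigma'} \quad \sigma' \preceq \rho}{\{w_{\sigma}\langle A\rangle, r_{\sigma'}\langle A'\rangle\} \in RType_{\rho}} \quad A <: A' \\[2ex]
\text{(RT-TUP)} & \dfrac{A_i \in RType_{\rho}\ (\forall i)}{(A_1, \ldots, A_k) \in RType_{\rho}}
\end{array}
$$

Let $RType$ be the union of the kinds $RType_{\rho}$ over all $\rho$. □ 

Fig. 4 Typing Rules 

$$
\begin{array}{ll}
\text{(T-ID)} & \dfrac{\Gamma(u) <: A}{\Gamma \vdash u : A} \\[2ex]
\text{(T-BASE)} & \dfrac{bv \in B_{\sigma}}{\Gamma \vdash bv : B_{\sigma}} \\[2ex]
\text{(T-TUP)} & \dfrac{\Gamma \vdash v_i : A_i\ (\forall i)}{\Gamma \vdash (v_1, \ldots, v_k) : (A_1, \ldots, A_k)} \\[2ex]
\text{(T-EQ)} & \dfrac{\Gamma \vdash u : A,\ v : B \quad \Gamma \vdash^{\sigma} Q \quad \Gamma \sqcap \{u : B, v : A\} \vdash^{\sigma} P}{\Gamma \vdash^{\sigma} \text{if } u = v \text{ then } P \text{ else } Q} \\[2ex]
\text{(T-IN)} & \dfrac{\Gamma, X : A \vdash^{\sigma} P \quad \Gamma \vdash u : r_{\sigma}\langle A\rangle}{\Gamma \vdash^{\sigma} u?(X : A)\, P} \\[2ex]
\text{(T-OUT)} & \dfrac{\Gamma \vdash u : w_{\sigma}\langle A\rangle \quad \Gamma \vdash v : A}{\Gamma \vdash^{\sigma} u!\langle v\rangle} \\[2ex]
\text{(T-SR)} & \dfrac{\Gamma \vdash^{\sigma \sqcap \rho} P}{\Gamma \vdash^{\sigma} \rho[\![P]\!]} \\[2ex]
\text{(T-NEW)} & \dfrac{\Gamma, a : A \vdash^{\sigma} P}{\Gamma \vdash^{\sigma} (\text{new } a : A)\, P} \\[2ex]
\text{(T-STR)} & \dfrac{\Gamma \vdash^{\sigma} P,\ Q}{\Gamma \vdash^{\sigma} P \mid Q,\ {*P},\ \mathbf{0}}
\end{array}
$$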

Note that if $\sigma \preceq \rho$ then $RType_{\sigma} \subseteq RType_{\rho}$. Intuitively, low level values are 
accessible to high level processes. However the converse is not true. For example 
$w_{\mathsf{top}}\langle\rangle \in RType_{\mathsf{top}}$ but $w_{\mathsf{top}}\langle\rangle$ is not in $RType_{\mathsf{bot}}$. The compatibility requirement 
between read and write capabilities in a type (rt-wrrd), in addition to the 
typing implications discussed in [24], also has security implications. For example 
suppose $r_{\mathsf{bot}}\langle B_{\sigma}\rangle$ and $w_{\mathsf{top}}\langle B\rangle$ are capabilities in a valid channel type. Then 
a priori a high level process can write to the channel while a low level process 
may read from it. However the only possibility for $\sigma$ is bot, that is only low level 
values may be read. Moreover the requirement $B <: B_{\sigma}$ implies that $B$ must also 
be $B_{\mathsf{bot}}$. So although high level processes may write to the channel they may 
only write low level values. 

Proposition 1. For every $\rho$, $RType_{\rho}$ is a preorder with respect to $<:$, with both 
a partial meet operation $\sqcap$ and a partial join $\sqcup$. □ 
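
A structural reading of the subtyping rules can be sketched as a recursive check. This is only an illustration: the class names are ours, levels are integers with 0 for bot and 1 for top, and we assume that write capabilities are contravariant in their payload at a fixed level while read capabilities are covariant in both payload and level. That reading of (u-wr) is an assumption on our part.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union

BOT, TOP = 0, 1  # hypothetical two-point lattice


@dataclass(frozen=True)
class Base:
    level: int


@dataclass(frozen=True)
class Cap:
    kind: str          # "w" for write, "r" for read
    level: int
    payload: "Type"


@dataclass(frozen=True)
class Res:
    caps: FrozenSet[Cap]


@dataclass(frozen=True)
class Tup:
    items: Tuple["Type", ...]


Type = Union[Base, Res, Tup]


def cap_sub(c: Cap, d: Cap) -> bool:
    """Capability subtyping: writes contravariant, reads covariant."""
    if c.kind != d.kind:
        return False
    if c.kind == "w":
        return c.level == d.level and subtype(d.payload, c.payload)
    return c.level <= d.level and subtype(c.payload, d.payload)


def subtype(a: Type, b: Type) -> bool:
    """Structural check of a <: b on base, resource and tuple types."""
    if isinstance(a, Base) and isinstance(b, Base):
        return a.level <= b.level
    if isinstance(a, Res) and isinstance(b, Res):
        # Every capability demanded by b must be matched by one offered by a.
        return all(any(cap_sub(c, d) for c in a.caps) for d in b.caps)
    if isinstance(a, Tup) and isinstance(b, Tup):
        return (len(a.items) == len(b.items)
                and all(subtype(x, y) for x, y in zip(a.items, b.items)))
    return False
```

For instance a read/write channel type is a subtype of the corresponding read-only type, but not conversely, so a channel may always be passed on with fewer capabilities.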

A type environment is a finite mapping from identifiers (names and variables) 
to types. We adopt some standard notation. For example, let ‘$\Gamma, u : A$’ denote the 
obvious extension of $\Gamma$; ‘$\Gamma, u : A$’ is only defined if $u$ is not in the domain of $\Gamma$. The 
subtyping relation $<:$ together with the partial operators $\sqcap$ and $\sqcup$ may also be 
extended to environments. We will normally abbreviate the simple environment 
$\{u : A\}$ to $u : A$ and moreover use $v : A$ to denote its obvious generalisation to 
values. 

The typing system is given in Figure 4 where the judgements are of the form 
‘$\Gamma \vdash^{\sigma} P$’. If $\Gamma \vdash^{\sigma} P$ we say that $P$ is a $\sigma$-level process. Also, let ‘$\Gamma \vdash P$’ abbreviate 
‘$\Gamma \vdash^{\mathsf{top}} P$’. 

Intuitively ‘$\Gamma \vdash^{\sigma} P$’ indicates that the process $P$ will not cause any security 
errors if executed with security clearance $\sigma$. The rules are very similar to those 
used in papers such as [24, 22] for the standard IO typing of the $\pi$-calculus. 
Indeed the only significant use of the security levels is in the (t-in) and (t-out) 
rules, where the channels are required to have a specific security level. This is 
inferred using auxiliary value judgements, of the form $\Gamma \vdash u : A$. It is interesting 
to note that security levels play no direct role in their derivation. 

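Theorem 1 (Subject Reduction). Suppose $\Gamma \vdash^{\sigma} P$. Then 

- $P \xrightarrow{\tau} Q$ implies $\Gamma \vdash^{\sigma} Q$. 

- $P \xrightarrow{(\tilde{c}:\tilde{C})a?v} Q$ implies there exists a type $A$ such that $\Gamma \vdash a : r_{\delta}\langle A\rangle$ for some 
$\delta \preceq \sigma$, and if $\Gamma \sqcap v : A$ is well-defined then $\Gamma \sqcap v : A \vdash^{\sigma} Q$. 

- $P \xrightarrow{(\tilde{c}:\tilde{C})a!v} Q$ implies there exists a type $A$ such that $\Gamma \vdash a : w_{\delta}\langle A\rangle$ for some 
$\delta \preceq \sigma$, $\Gamma, \tilde{c} : \tilde{C} \vdash v : A$ and $\Gamma, \tilde{c} : \tilde{C} \vdash^{\sigma} Q$. 

□ 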



We can now state the first main result: 

Theorem 2 (Type Safety). If $\Gamma \vdash P$ then for every closed context $C[\,]$ such 
that $\Gamma \vdash C[P]$ and every $Q$ such that $C[P] \stackrel{\tau}{\longrightarrow}^{*} Q$ we have 
$Q \not\stackrel{\Gamma}{\longmapsto} err$. 

□ 

Having defined our typing system we may now view $\sigma[\![P]\!]$ simply as notation 
for the fact that, relative to the current typing environment $\Gamma$, the process $P$ is 
well-typed at level $\sigma$, i.e. $\Gamma \vdash^{\sigma} P$. Technically we can view $\sigma[\![P]\!]$ to be structurally 
equivalent to $P$, assuming we are working in an environment $\Gamma$ such that $\Gamma \vdash^{\sigma} P$. 

4 Information Flow 

We have shown in the previous sections that, in well-typed systems, processes 
running at a given security level can only access resources appropriate to that 
level. However, as pointed out in the Introduction this does not rule out (im- 
plicit) information flow between levels. One way of formalizing this notion of 
flow of information is to consider the behaviour of processes and how it can be 
influenced. If the behaviour of low-level processes is independent of any high- 
level values in its environment then we can say that there can be no implicit flow 
of information from high-level to low-level. This is not the case in the example 
considered in the Introduction, $(\star)$. Suppose, for example, that $Q$ is the code 
fragment ‘if $z = 0$ then $l_1!\langle\rangle$ else $l_2!\langle\rangle$’. If $(\star)$ were placed in an environment with 
‘$\mathsf{top}[\![h!\langle 0\rangle]\!]$’, then the resource $l_1$ would be called. If, instead, $(\star)$ were placed in 
an environment with ‘$\mathsf{top}[\![h!\langle 42\rangle]\!]$’, then $l_2$ would be called. In other words the 
behaviour of the low-level process can be influenced by high-level changes; there 
is a possibility of information flow downwards. 

This is not surprising in view of the type associated with the channel $hl$; 
in the terminology of [2] it allows a write down from a high-level process to a 
low-level process. Thus if we are to eliminate implicit information flow between 
levels in well-typed processes we need to restrict further the allowed types; types 
such as $\{w_{\mathsf{top}}\langle\rangle, r_{\mathsf{bot}}\langle\rangle\}$ clearly contradict the spirit of secrecy. Thus, for the rest 
of the paper we work with the more restrictive set $IType$, the Information types. 
In order for $\{w_{\sigma}\langle A\rangle, r_{\sigma'}\langle A'\rangle\}$ to be in $IType$, it must be that $\sigma \preceq \sigma'$; this is not 
necessarily true for types in $RType$. 

Definition 2. For each $\rho$, let $IType_{\rho}$ be the least set that satisfies the rules in 
Definition 1, with (rt-wrrd) replaced by: 

$$
\text{(IT-WRRD)} \quad \dfrac{A \in IType_{\sigma} \quad \sigma \preceq \sigma' \quad A' \in IType_{\sigma'} \quad \sigma' \preceq \rho}{\{w_{\sigma}\langle A\rangle, r_{\sigma'}\langle A'\rangle\} \in IType_{\rho}} \quad A <: A'
$$

Let $IType$ be the union of $IType_{\rho}$ over all $\rho$. We write $\Gamma \Vdash^{\sigma} P$ if $\Gamma \vdash^{\sigma} P$ can be 
derived from the rules of Figure 4 using these more restrictive types. □ 

All of the results of the previous section carry over to the stronger typing system; 
we leave their elaboration to the reader. 

Unfortunately, due to the expressiveness of our language, the use of I-types 
still does not preclude information flow downwards, between levels. Consider the 
system 

$$\mathsf{top}[\![h?(x)\ \text{if } x = 0 \text{ then } \mathsf{bot}[\![l!\langle 0\rangle]\!] \text{ else } \mathsf{bot}[\![l!\langle 1\rangle]\!]]\!] \mid \mathsf{bot}[\![l?(z)\, Q]\!]$$

executing in an environment in which $h$ is a top-level read/write channel and $l$ 
is a bot-level read/write channel. This system can be well-typed using I-types, 
but there still appears to be some implicit flow of information from top 
to bot. The problem here is that our syntax allows a high-level process, which 
cannot write to low-level channels, to evolve into a low-level process which does 
have this capability; we need to place a boundary between low- and high-level 
processes which ensures a high-level process never gains write access to low-level 
channels. This is the aim of the following definition: 

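Definition 3. Define the security levels of a term below $\rho$, $\mathrm{sl}_{\rho}(P)$, as follows: 

$$
\begin{array}{ll}
\mathrm{sl}_{\rho}({*P}) = \mathrm{sl}_{\rho}(P) & \mathrm{sl}_{\rho}(\mathbf{0}) = \{\rho\} \\
\mathrm{sl}_{\rho}(\sigma[\![P]\!]) = \{\sigma \sqcap \rho\} \cup \mathrm{sl}_{\sigma \sqcap \rho}(P) & \mathrm{sl}_{\rho}((\text{new } a : A)\, P) = \mathrm{sl}_{\rho}(P) \\
\mathrm{sl}_{\rho}(u!\langle v\rangle) = \emptyset & \mathrm{sl}_{\rho}(P \mid Q) = \mathrm{sl}_{\rho}(P) \cup \mathrm{sl}_{\rho}(Q) \\
\mathrm{sl}_{\rho}(u?(X : B)\, P) = \mathrm{sl}_{\rho}(P) & \mathrm{sl}_{\rho}(\text{if } u = v \text{ then } P \text{ else } Q) = \mathrm{sl}_{\rho}(P) \cup \mathrm{sl}_{\rho}(Q)
\end{array}
$$

A process $P$ is $\sigma$-free if for every $\rho$ in $\mathrm{sl}_{\mathsf{top}}(P)$, $\rho \not\preceq \sigma$. □ 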
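
The function of Definition 3 is a straightforward recursion on terms, and can be sketched as follows. The encoding is hypothetical: terms are small dataclasses of our own naming, levels are integers with 0 for bot and 1 for top so that the meet is `min`, and the clause for output is read as returning the empty set, which is an assumption since that clause is damaged in our copy of the source.

```python
from dataclasses import dataclass

BOT, TOP = 0, 1  # hypothetical two-point lattice; meet is min


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class Out:
    chan: str
    value: str


@dataclass(frozen=True)
class In:
    chan: str
    var: str
    body: object


@dataclass(frozen=True)
class If:
    left: str
    right: str
    then: object
    orelse: object


@dataclass(frozen=True)
class Level:
    level: int
    body: object


@dataclass(frozen=True)
class New:
    name: str
    body: object


@dataclass(frozen=True)
class Par:
    left: object
    right: object


@dataclass(frozen=True)
class Rep:
    body: object


def sl(rho, p):
    """Security levels of the term p below rho."""
    if isinstance(p, Nil):
        return {rho}
    if isinstance(p, Out):
        return set()  # assumed reading of the output clause
    if isinstance(p, (In, New, Rep)):
        return sl(rho, p.body)
    if isinstance(p, Level):
        meet = min(p.level, rho)
        return {meet} | sl(meet, p.body)
    if isinstance(p, Par):
        return sl(rho, p.left) | sl(rho, p.right)
    if isinstance(p, If):
        return sl(rho, p.then) | sl(rho, p.orelse)
    raise TypeError(f"not a term: {p!r}")


def is_free(sigma, p):
    """p is sigma-free if no level in sl_top(p) lies at or below sigma."""
    return all(not (rho <= sigma) for rho in sl(TOP, p))
```

On the leaking system above, the top-level component contains subterms annotated bot, so it is not bot-free; a top-level process with no such annotations is.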

Non-interference, as discussed in the Introduction, $(\star\star)$, depends on a 
formulation of a behavioural equivalence, as the following example illustrates. Let $A$ 
denote the type $\{w_{\mathsf{bot}}\langle\rangle, r_{\mathsf{bot}}\langle\rangle\}$ and $B$ denote $\{r_{\mathsf{bot}}\langle\rangle\}$. Further, let $\Gamma$ map $a$ and 
$b$ to $A$ and $B$, respectively, and $n$ to the type $\{w_{\mathsf{bot}}\langle A\rangle, r_{\mathsf{bot}}\langle A\rangle\}$. Now consider 
the terms $P$ and $H$ defined by 

$$P \Leftarrow \mathsf{bot}[\![n!\langle a\rangle \mid n?(x : A)\, x!\langle\rangle]\!] \qquad H \Leftarrow \mathsf{top}[\![n?(x : B)\, b?(y)\, \mathbf{0}]\!]$$

It is very easy to check that $\Gamma \Vdash P, H$ and that $H$ is bot-free. Note that in the 
term $P \mid H$ there is contention between the low and high-level processes for who 
will receive a value on the channel $n$. This means that if we were to base the 
semantic relation on any of strong bisimulation equivalence, weak bisimulation 
equivalence, [19], or must testing, [21], we would have 

$$P \mid \mathbf{0} \not\approx^{\mathsf{bot}}_{\Gamma} P \mid H$$

The essential reason is that the consumption of writes can be detected; the 
reduction 

$$P \mid H \xrightarrow{\tau} \mathsf{bot}[\![n?(x : A)\, x!\langle\rangle]\!] \mid \mathsf{top}[\![b?(y)\, \mathbf{0}]\!]$$

cannot be matched by $P \mid \mathbf{0}$. Using the terminology of [21], $P \mid \mathbf{0}$ guarantees the 
test $\mathsf{bot}[\![a?(x)\, \omega!\langle\rangle]\!]$ whereas $P \mid H$ does not. 

May equivalence is defined in terms of tests. A test is a process with an 
occurrence of a new reserved resource name $\omega$. We use $T$ to range over tests, 
with the typing rule $\Gamma \Vdash^{\sigma} \omega!\langle\rangle$ for all $\Gamma$. When placed in parallel with a process $P$, 
a test may interact with $P$, producing an output on $\omega$ if some desired behaviour 
of $P$ has been observed. We write $T \Downarrow$ if $T \stackrel{\tau}{\longrightarrow}^{*} T'$, where $T'$ has the form 
$(\text{new } \tilde{c})\,(\omega!\langle\rangle \mid T'')$ for some $T''$ and $\tilde{c}$; that is $T$ can eventually report success. 

We wish to capture the behaviour of processes at a given level of security. 
Consequently we only compare their ability to pass tests that are well-typed 
at that level. The definition must also take into account the environment in 
which the processes are used, as this determines the security level associated 
with resources. 

Definition 4. We write $P \simeq^{\sigma}_{\Gamma} Q$ if for every test $T$ such that $\Gamma \Vdash^{\sigma} T$: 

$$(P \mid T) \Downarrow \quad\text{if and only if}\quad (Q \mid T) \Downarrow$$

□ 



We can now state the main result of the paper. 

Theorem 3 (Non-Interference). If $\Gamma \Vdash^{\sigma} P, Q$ and $\Gamma \Vdash^{\mathsf{top}} H, K$ where $H$ and 
$K$ are $\sigma$-free processes, then $P \simeq^{\sigma}_{\Gamma} Q$ implies $P \mid H \simeq^{\sigma}_{\Gamma} Q \mid K$. □ 

The proof of the theorem relies on constructing a sufficient condition to guarantee 
that two processes are may equivalent. This condition involves the asynchronous 
sequences of actions which processes can perform in the type environment $\Gamma$. The 
details may be found in the full version of the paper, [16], which also contains 
the subsequent proof of the non-interference result. 

Finally let us remark that if we allowed synchronous tests then this result 
would no longer hold. For an appropriate $\Gamma$ we would have: 

$$P \mid \mathbf{0} \not\simeq^{\mathsf{bot}}_{\Gamma} P \mid H$$

Let $T$ be the test $\mathsf{bot}[\![b!\langle\rangle\, \omega!\langle\rangle]\!]$. Then $P \mid H \mid T$ may eventually produce an output 
on $\omega$ whereas $P \mid \mathbf{0} \mid T$ cannot. However, since our language is asynchronous, such 
tests are not allowed. 

5 Conclusions and Related Work 

Methods for controlling information flow are a central research issue in computer 
security [7, 14, 27] and in the Introduction we have indicated a number of 
different approaches to its formalisation. Non-interference has emerged as a useful 
concept and is widely used to infer (indirectly) the absence of information flow. 
In publications such as [25, 9] it has been pointed out that process algebras may 
be fruitfully used to formalise and investigate this concept; for example in [8] 
process algebra based methods are suggested for investigating security protocols, 
essentially using a formalisation of non-interference for CCS. 

However in these publications the non-interference is always defined 
behaviourally, as a condition on the possible traces of CCS or CSP processes; useful 
surveys of trace based non-interference may be found in [9, 26]. Here, we work 
with the more expressive $\pi$-calculus, which allows dynamic process creation and 
network reconfiguration. Our approach to non-interference is also more 
extensional in that it is expressed in terms of how processes affect their environments, 
relative to a particular behavioural equivalence. However the proof of our main 
result, Theorem 3, describes may equivalence in terms of (typed) traces; 
presumably a trace based definition of non-interference, similar in style to those in 
[9, 26], could be extracted from this proof. 

More importantly our approach differs from much of the recent process calculus 
based security research in that we develop purely static methods for ensuring 
security. Processes are shown to be secure not by demonstrating some property 
of trace sets, using a tool such as that in [10], but by type-checking. Types 
have also been used in this manner in [1] for an extension of the $\pi$-calculus called 
the spi-calculus. But there the structure of the types is very straightforward: 
the type Secret representing a secret channel, the type Public representing a 
public one, and Any which could be either. However the main interest is in 
the type rules for the encryption/decryption primitives of the spi-calculus. The 
non-interference result also has a different formulation to ours; it states that 
the behaviour of well-typed processes is invariant, relative to may testing, under 
certain value-substitutions. Intuitively, it means that the encryption/decryption 
primitives preserve values of type Secret from certain kinds of attackers. It would 
be interesting to add these primitives to our security $\pi$-calculus and to try 
to adapt the associated type rules to the set of I-types. 

An extension of the $\pi$-calculus is also considered in [18], where a sophisticated 
type system is used to control information flow. The judgements in their system 
take the form 

$$\Gamma \vdash_{s} P \triangleright A$$

where $s$ is a security level, $P$ is a process term, $A$ is a poset of so-called action 
nodes and $\Gamma$ is a type environment. Their environments are quite similar to ours, 
essentially associating with channels a version of input/output types annotated 
with, among other things, security levels. However their intuition, and much of 
the technical development, is quite different from ours. In summary it appears 
that our type system addresses information flow within the core $\pi$-calculus while 
the more sophisticated one of [18] controls the flow allowed via the extra syntactic 
constructs of their language. However a more thorough comparison between the 
two systems deserves to be made. 

Abstract. A fundamental goal of biology is to understand how living 
cells work. Recent developments in biotechnology and information pro- 
cessing have revolutionized this research field. Computational biology 
is a major component of this revolution and a fertile source of inter- 
esting problems related to algorithm design, combinatorics, statistics, 
combinatorial optimization, pattern recognition, data mining and com- 
putational learning theory. The speaker will provide an overview of this 
field, describing such areas as genomic mapping and sequencing, sequence 
analysis and analysis of gene expression data. He will then describe how 
his research in this field has called upon his background in theoretical 
computer science but required a shift in his approach to the design and 
development of algorithms. 

Abstract. We use alternating automata on infinite words to reduce the 
verification of linear temporal logic (LTL) safety properties over infinite-state 
systems to the proof of first-order verification conditions. This 
method generalizes the traditional deductive verification approach of 
providing verification rules for particular classes of formulas, such as 
invariances, nested precedence formulas, etc. It facilitates the deductive 
verification of arbitrary safety properties without the need for explicit 
temporal reasoning. 



1 Introduction 

Temporal logic is a powerful language for specifying properties of reactive 
systems. However, a specification language should be accompanied by verification 
methods to be useful in practice. For finite-state systems model checking 
provides such a verification method: it is (largely) automatic and applicable to 
arbitrary temporal properties. For infinite-state systems the situation is 
different. Although complete proof systems have been proposed for CTL [GF96] and 
LTL [MP91], the proof system presented for LTL requires the formula to be in 
a canonical form for the rules to be applicable. Transforming a formula into 
canonical form is expensive, the formula may grow exponentially, and worse, it 
may result in a formula that is so different from the original that the user’s 
intuition proves useless for constructing the invariants and intermediate assertions 
necessary to complete the proof. 

In this paper we present a verification rule SAFE that reduces the proof of LTL 
safety formulas to the proof of first-order validities. The rule SAFE constructs an 
alternating automaton [Var96, Var97] for the formula to be proven. This 
automaton may have to be strengthened by the user. First-order verification conditions 
are then generated based on the structure and labeling of this automaton. 

The approach resembles that of verification diagrams [BMS95] and assertion 
graphs [BBM97], which also reduce the proof of temporal properties to first-order 
verification conditions. In principle verification diagrams can be generated 
automatically (apart from the strengthening and refinement necessary for fairness), 
using an algorithm similar to the tableau construction for LTL, thus reducing a 
temporal property to first-order verification conditions. However, verification 
diagrams are based on (nondeterministic) $\omega$-automata, and the size of the resulting 
diagram can, in the worst case, be exponential in the size of the formula, giving 
rise to a number of first-order verification conditions of the same order of 
magnitude, which clearly is undesirable. Alternating automata have the advantage 
over regular $\omega$-automata that they are linear in the size of the formula, thus 
making the number of verification conditions generated also proportional to it. 

* This research was supported in part by the National Science Foundation under 
grants CCR-98-04100 and CCR-99-00984, ARO under grants DAAH04-96-1-0122 and 
DAAG55-98-1-0471, ARO under MURI grant DAAH04-96-1-0341, by Army contract 
DABT63-96-C-0096 (DARPA), and by Air Force contract F33615-99-C-3014. 

The remainder of the paper is organized as follows. Section 2 provides the 
preliminaries: it presents our computational model of transition systems, and 
specification language of linear temporal logic (LTL). Alternating automata and 
their models are introduced in Section 3. In Section 4 we give an algorithm 
to construct an alternating automaton for future LTL formulas, and we prove 
that the language accepted by the constructed automaton is precisely the set of 
sequences of states that satisfy the formula. Section 5 proposes the verification 
rules B-SAFE and SAFE that reduce the verification of future safety formulas to 
first-order verification conditions, and it is shown that the special verification 
rules of [MP95] are subsumed by this rule. In Section 6 we give an algorithm to 
construct an alternating automaton for LTL formulas involving past operators, 
and we propose a verification rule for such formulas. Finally, in Section 7 we 
discuss some of the limitations of these rules and give some ideas on how they 
could be overcome. 

2 Preliminaries 

2.1 Computational Model: Fair Transition Systems 

The computational model used for reactive systems is that of a transition system 
[MP95] (TS), $\mathcal{S} = \langle V, \Theta_{\mathcal{S}}, \mathcal{T}\rangle$, where $V$ is a finite set of variables, $\Theta_{\mathcal{S}}$ is an initial 
condition, and $\mathcal{T}$ is a finite set of transitions. A state $s$ is an interpretation of $V$, 
and $\Sigma$ denotes the set of all states. A transition $\tau \in \mathcal{T}$ is a function $\tau : \Sigma \to 2^{\Sigma}$, 
and each state in $\tau(s)$ is called a $\tau$-successor of $s$. We say that a transition $\tau$ 
is enabled on $s$ if $\tau(s) \neq \emptyset$, otherwise $\tau$ is disabled on $s$. Each transition $\tau$ is 
represented by a transition relation $\rho_{\tau}(s, s')$, an assertion that expresses the 
relation between the values of $V$ in $s$ and the values of $V$ (referred to by $V'$) in 
any of its $\tau$-successors $s'$. 

A run of $\mathcal{S}$ is an infinite sequence of states such that the first state satisfies 
$\Theta_{\mathcal{S}}$ and any two consecutive states satisfy $\rho_{\tau}$ for some $\tau \in \mathcal{T}$. A state $s$ is 
called $\mathcal{S}$-accessible if it appears in some run of $\mathcal{S}$. The set of all runs of $\mathcal{S}$ is 
denoted by $\mathcal{L}(\mathcal{S})$. 
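
To make the model concrete, a small finite instance can be written down directly, with each transition given as a function from a state to its set of successors. The system below is purely illustrative: the counter, the transition names `inc` and `reset`, and the helper `accessible` are ours. Note that the search computes reachable states; this coincides with accessibility only when every reachable state has some enabled transition, as is the case here, since runs are required to be infinite.

```python
# Hypothetical system: a single counter variable, so a state is an integer.

def inc(s):
    """Increment the counter while it is below 3; disabled otherwise."""
    return frozenset({s + 1}) if s < 3 else frozenset()


def reset(s):
    """Reset the counter once it has reached 3; disabled otherwise."""
    return frozenset({0}) if s == 3 else frozenset()


TRANSITIONS = {"inc": inc, "reset": reset}


def enabled(tau, s):
    """A transition is enabled on s if it has at least one successor."""
    return len(tau(s)) > 0


def accessible(initial, transitions):
    """States reachable from the initial states by any sequence of transitions."""
    seen = set(initial)
    frontier = list(initial)
    while frontier:
        s = frontier.pop()
        for tau in transitions.values():
            for t in tau(s):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return seen
```

For infinite-state systems such an enumeration is of course unavailable, which is why the paper works with assertions over $V$ and $V'$ instead.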

2.2 Specification Language: Linear Temporal Logic 

The specification language studied in this paper is linear temporal logic. We 
assume an underlying assertion language which is a first-order language over 
interpreted symbols for expressing functions and relations over some concrete 
domains. We refer to a formula in the assertion language as a state formula or 
assertion. A temporal formula is constructed out of state formulas to which we 
apply the boolean connectives and the temporal operators shown below. 

Temporal formulas are interpreted over a model, which is an infinite sequence 
of states $\sigma : s_0, s_1, \ldots$. Given a model $\sigma$, a state formula $p$ and temporal formulas 
$\varphi$ and $\psi$, we present an inductive definition for the notion of a formula holding 
at a position $j \geq 0$ in $\sigma$, denoted by $(\sigma, j) \models \varphi$. 

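For a state formula: 

$$(\sigma, j) \models p \quad\text{iff}\quad s_j \models p, \text{ that is, } p \text{ holds on state } s_j.$$

For the boolean connectives: 

$$
\begin{array}{lll}
(\sigma, j) \models \varphi \wedge \psi & \text{iff} & (\sigma, j) \models \varphi \text{ and } (\sigma, j) \models \psi \\
(\sigma, j) \models \varphi \vee \psi & \text{iff} & (\sigma, j) \models \varphi \text{ or } (\sigma, j) \models \psi \\
(\sigma, j) \models \neg\varphi & \text{iff} & (\sigma, j) \not\models \varphi
\end{array}
$$

For the future temporal operators: 

$$
\begin{array}{lll}
(\sigma, j) \models \bigcirc\varphi & \text{iff} & (\sigma, j + 1) \models \varphi \\
(\sigma, j) \models \Box\varphi & \text{iff} & (\sigma, i) \models \varphi \text{ for all } i \geq j \\
(\sigma, j) \models \Diamond\varphi & \text{iff} & (\sigma, i) \models \varphi \text{ for some } i \geq j \\
(\sigma, j) \models \varphi\, \mathcal{U}\, \psi & \text{iff} & (\sigma, k) \models \psi \text{ for some } k \geq j, \text{ and } (\sigma, i) \models \varphi \text{ for every } i,\ j \leq i < k \\
(\sigma, j) \models \varphi\, \mathcal{W}\, \psi & \text{iff} & (\sigma, j) \models \varphi\, \mathcal{U}\, \psi \text{ or } (\sigma, j) \models \Box\varphi
\end{array}
$$

For the past temporal operators: 

$$
\begin{array}{lll}
(\sigma, j) \models \ominus\varphi & \text{iff} & j > 0 \text{ and } (\sigma, j - 1) \models \varphi \\
(\sigma, j) \models \tilde{\ominus}\varphi & \text{iff} & j = 0 \text{ or } (\sigma, j - 1) \models \varphi \\
(\sigma, j) \models \boxminus\varphi & \text{iff} & (\sigma, i) \models \varphi \text{ for all } 0 \leq i \leq j \\
(\sigma, j) \models \mathrel{\ooalign{$\Diamond$\cr\hidewidth$-$\hidewidth}}\varphi & \text{iff} & (\sigma, i) \models \varphi \text{ for some } 0 \leq i \leq j \\
(\sigma, j) \models \varphi\, \mathcal{S}\, \psi & \text{iff} & (\sigma, k) \models \psi \text{ for some } k \leq j, \text{ and } (\sigma, i) \models \varphi \text{ for every } i,\ k < i \leq j \\
(\sigma, j) \models \varphi\, \mathcal{B}\, \psi & \text{iff} & (\sigma, j) \models \varphi\, \mathcal{S}\, \psi \text{ or } (\sigma, j) \models \boxminus\varphi
\end{array}
$$
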


An infinite sequence of states $\sigma$ satisfies a temporal formula $\varphi$, written $\sigma \models \varphi$, if 
$(\sigma, 0) \models \varphi$. The set of all sequences that satisfy a formula $\varphi$ is denoted by $\mathcal{L}(\varphi)$, 
the language of $\varphi$. 

We say that a formula is a future (past) formula if it contains only state 
formulas, boolean connectives and future (past) temporal operators. We say 
that a formula is a general safety formula if it is of the form $\Box\varphi$, for a past 
formula $\varphi$. 
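
Since the truth of a past formula at position $j$ depends only on the states $s_0, \ldots, s_j$, the clauses for the past operators can be evaluated directly on a finite prefix of a model. The following sketch does this; the tuple encoding of formulas and the operator tags (`prev`, `wprev`, `hist`, `once`, `since`, `backto`) are our own naming, and state formulas are represented by Python predicates on states.

```python
def holds(f, states, j):
    """Evaluate a past formula f at position j of the finite prefix `states`."""
    op = f[0]
    if op == "atom":      # state formula, given as a predicate on states
        return f[1](states[j])
    if op == "not":
        return not holds(f[1], states, j)
    if op == "and":
        return holds(f[1], states, j) and holds(f[2], states, j)
    if op == "or":
        return holds(f[1], states, j) or holds(f[2], states, j)
    if op == "prev":      # previous: there is a previous position satisfying f
        return j > 0 and holds(f[1], states, j - 1)
    if op == "wprev":     # before: holds vacuously at position 0
        return j == 0 or holds(f[1], states, j - 1)
    if op == "hist":      # has always been: all positions 0..j
        return all(holds(f[1], states, i) for i in range(j + 1))
    if op == "once":      # once: some position 0..j
        return any(holds(f[1], states, i) for i in range(j + 1))
    if op == "since":     # f[1] since f[2]
        return any(
            holds(f[2], states, k)
            and all(holds(f[1], states, i) for i in range(k + 1, j + 1))
            for k in range(j + 1)
        )
    if op == "backto":    # f[1] back-to f[2]: since, or has always been
        return (holds(("since", f[1], f[2]), states, j)
                or holds(("hist", f[1]), states, j))
    raise ValueError(f"unknown operator: {op}")
```

A general safety formula $\Box\varphi$ with $\varphi$ a past formula then holds on a model exactly when `holds` succeeds at every position, which is what makes such formulas checkable one state at a time.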

A state formula p is called S-state valid if it holds over all .S-accessible states. 
A temporal formula p is called S-valid (valid over system S), denoted by 

S ^p , 



if it holds over all runs of S. 
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3 Alternating Automata 

Alternating automata are a generalization of nondeterministic automata. Non- 
deterministic automata have an existential flavor: a word is accepted if it is 
accepted by some path through the automaton. On the other hand, $\forall$-automata
[MP87] have a universal flavor: a word is accepted if it is accepted by all paths.
Alternating automata combine the two flavors by allowing choices along a path 
to be marked as either existential or universal. 



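Example Consider the two automata shown in Figure 1. An arc between two
edges denotes an “and” choice: both paths have to be accepting. The absence of
an arc denotes the (regular) “or” choice.

Fig. 1. An alternating automaton $\mathcal{A}_{alt}$ and a nondeterministic automaton $\mathcal{A}_{nd}$

Automaton $\mathcal{A}_{alt}$ accepts only sequences of the form

$\langle \_, \_ \rangle, \langle p, q \rangle, \langle p, \_ \rangle, \langle p, \_ \rangle, \langle p, \_ \rangle, \ldots$

where “$\_$” denotes a “don’t care”. For a sequence to be accepted, it has to
be accepted by both branches. Automaton $\mathcal{A}_{nd}$, on the other hand, accepts
sequences of the form

$\langle \_, \_ \rangle, \langle p, \_ \rangle, \langle p, \_ \rangle, \langle p, \_ \rangle, \langle p, \_ \rangle, \ldots$

and sequences of the form

$\langle \_, \_ \rangle, \langle \_, q \rangle, \langle \_, \_ \rangle, \langle \_, \_ \rangle, \langle \_, \_ \rangle, \ldots$ ■
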
Automata are usually defined with input symbols labeling the edges. How- 
ever, for our purposes it is more convenient to have them label the nodes. There- 
fore our definition of alternating automata is somewhat different from those 
found in [Var96,Var97].

Definition 1 (Alternating Automaton) An alternating automaton $\mathcal{A}$ is
defined recursively as follows:

$\mathcal{A}$ ::= $\epsilon_{\mathcal{A}}$                          empty automaton
     | $\langle \nu, \delta, f \rangle$          single node
     | $\mathcal{A} \land \mathcal{A}$           conjunction of two automata
     | $\mathcal{A} \lor \mathcal{A}$            disjunction of two automata

where $\nu$ is a state formula, $\delta$ is an alternating automaton expressing the next-
state relation, and $f$ indicates whether the node is accepting (denoted by $+$) or
rejecting (denoted by $-$). We require that the automaton be finite.

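The set of nodes of an alternating automaton $\mathcal{A}$, denoted by $\mathcal{N}(\mathcal{A})$, is formally
defined as

$\mathcal{N}(\epsilon_{\mathcal{A}}) = \emptyset$

$\mathcal{N}(\langle \nu, \delta, f \rangle) = \{\langle \nu, \delta, f \rangle\} \cup \mathcal{N}(\delta)$

$\mathcal{N}(\mathcal{A}_1 \land \mathcal{A}_2) = \mathcal{N}(\mathcal{A}_1) \cup \mathcal{N}(\mathcal{A}_2)$

$\mathcal{N}(\mathcal{A}_1 \lor \mathcal{A}_2) = \mathcal{N}(\mathcal{A}_1) \cup \mathcal{N}(\mathcal{A}_2)$

We denote with $\mathcal{N}_{rej}(\mathcal{A})$ the set of nodes of $\mathcal{A}$ that are rejecting, that is,
$\mathcal{N}_{rej}(\mathcal{A}) = \{n \in \mathcal{N}(\mathcal{A}) \mid f(n) = -\}$ .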

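Example The automata shown in Figure 1 can be written as follows:

$\mathcal{A}_{alt} : \langle true, \mathcal{A}_1 \land \langle q, \mathcal{A}_2, + \rangle, + \rangle$ and $\mathcal{A}_{nd} : \langle true, \mathcal{A}_1 \lor \langle q, \mathcal{A}_2, + \rangle, + \rangle$
where

$\mathcal{A}_1 = \langle p, \mathcal{A}_1, + \rangle$ and $\mathcal{A}_2 = \langle true, \mathcal{A}_2, + \rangle$ .

■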
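The recursive definition and the node-set function can be mirrored in a small data structure. The sketch below is an illustration under assumptions, not the authors' implementation: the class names `Empty`, `Node`, `And`, `Or` and the helper names are invented here, node labels are kept as plain strings, and cycles (such as $\mathcal{A}_1$ referring to itself) are closed by assigning `delta` after construction.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Empty:
    """The empty automaton."""


@dataclass(eq=False)
class Node:
    nu: str          # state formula labelling the node (kept as text here)
    delta: object    # next-state automaton; assigned later to close cycles
    accepting: bool  # True for "+", False for "-"


@dataclass(eq=False)
class And:
    left: object
    right: object


@dataclass(eq=False)
class Or:
    left: object
    right: object


def nodes(automaton):
    """Collect the node set: every node reachable through And, Or and delta."""
    found, visited, stack = set(), set(), [automaton]
    while stack:
        cur = stack.pop()
        if id(cur) in visited:      # each object is expanded once, so cycles terminate
            continue
        visited.add(id(cur))
        if isinstance(cur, Node):
            found.add(cur)
            stack.append(cur.delta)
        elif isinstance(cur, (And, Or)):
            stack.extend((cur.left, cur.right))
    return found


def rejecting_nodes(automaton):
    """The subset of nodes marked as rejecting."""
    return {n for n in nodes(automaton) if not n.accepting}


# The two automata of the example above.
EPS = Empty()
a1 = Node("p", EPS, True)
a1.delta = a1                      # A1 = <p, A1, +>
a2 = Node("true", EPS, True)
a2.delta = a2                      # A2 = <true, A2, +>
q_node = Node("q", a2, True)
a_alt = Node("true", And(a1, q_node), True)
a_nd = Node("true", Or(a1, q_node), True)

print(sorted(n.nu for n in nodes(a_alt)))
```

Using `eq=False` keeps the default identity-based hashing, so two nodes with the same label remain distinct members of the node set.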

A path through a regular $\omega$-automaton is an infinite sequence of nodes. A
“path” through an alternating $\omega$-automaton is, in general, a tree. To define the
language of an alternating automaton, we first define a tree.

Definition 2 A tree is defined recursively as follows:

$T$ ::= $\epsilon_T$                     empty tree
     | $T \cdot T$               composition
     | $\langle node, T \rangle$    single node with child tree

A tree may have both finite and infinite branches.

Definition 3 (Run) Given an infinite sequence of states $\sigma : s_0, s_1, \ldots$, a tree
$T$ is called a run of $\sigma$ in $\mathcal{A}$ if one of the following holds:

$\mathcal{A} = \epsilon_{\mathcal{A}}$ and $T = \epsilon_T$

$\mathcal{A} = n$ and $T = \langle n, T' \rangle$, $s_0 \models \nu(n)$ and
$T'$ is a run of $s_1, s_2, \ldots$ in $\delta(n)$

$\mathcal{A} = \mathcal{A}_1 \land \mathcal{A}_2$ and $T = T_1 \cdot T_2$,
$T_1$ is a run of $\mathcal{A}_1$ and $T_2$ is a run of $\mathcal{A}_2$

$\mathcal{A} = \mathcal{A}_1 \lor \mathcal{A}_2$ and $T$ is a run of $\mathcal{A}_1$ or $T$ is a run of $\mathcal{A}_2$

Definition 4 (Accepting run) A run $T$ is accepting if every infinite branch
contains infinitely many accepting nodes.



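Example A run of the sequence

$\sigma : \langle p, q \rangle, \langle p, q \rangle, \langle p, \neg q \rangle, \langle p, q \rangle, \ldots$

in the automaton $\mathcal{A}_{alt}$ of Figure 1 is shown in Figure 2. The run is clearly
accepting since both branches contain infinitely many accepting nodes. ■

        n0
       /  \
     n1    n2
     |     |
     n1    n3
     |     |
     n1    n3

Fig. 2. Run of $\sigma : \langle p, q \rangle, \langle p, q \rangle, \langle p, \neg q \rangle, \langle p, q \rangle, \ldots$ in $\mathcal{A}_{alt}$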



Definition 5 (Model) An infinite sequence of states $\sigma$ is a model of an alter-
nating automaton $\mathcal{A}$ if there exists an accepting run of $\sigma$ in $\mathcal{A}$.

The set of models of an automaton $\mathcal{A}$, also called the language of $\mathcal{A}$, is
denoted by $\mathcal{L}(\mathcal{A})$.

4 Linear Temporal Logic: Future Formulas 

It has been shown that for every LTL formula $\varphi$ there exists an alternating
automaton $\mathcal{A}$ such that $\mathcal{L}(\varphi) = \mathcal{L}(\mathcal{A})$ and the size of $\mathcal{A}$ is linear in the size
of $\varphi$ [Var97]. In [Var97] a construction method is given for such an automaton
with propositions labeling the edges. Since we prefer to label the nodes with
propositions (or, in our case, state formulas), we present a slightly different
procedure. In the remainder of this paper we assume that all negations have
been pushed in to the state level (a full set of rewrite rules to accomplish this is
given in [MP95]), that is, no temporal operator is in the scope of a negation.

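Given an LTL formula $\varphi$, an alternating automaton $\mathcal{A}(\varphi)$ is constructed as
follows.

For a state formula $p$:

$\mathcal{A}(p) = \langle p, \epsilon_{\mathcal{A}}, + \rangle$ .

For temporal formulas $\varphi$ and $\psi$:

$\mathcal{A}(\varphi \land \psi) = \mathcal{A}(\varphi) \land \mathcal{A}(\psi)$

$\mathcal{A}(\varphi \lor \psi) = \mathcal{A}(\varphi) \lor \mathcal{A}(\psi)$

$\mathcal{A}(\bigcirc\varphi) = \langle true, \mathcal{A}(\varphi), + \rangle$

$\mathcal{A}(\Box\varphi) = \langle true, \mathcal{A}(\Box\varphi), + \rangle \land \mathcal{A}(\varphi)$

$\mathcal{A}(\Diamond\varphi) = \langle true, \mathcal{A}(\Diamond\varphi), - \rangle \lor \mathcal{A}(\varphi)$

$\mathcal{A}(\varphi\,\mathcal{U}\,\psi) = \mathcal{A}(\psi) \lor (\langle true, \mathcal{A}(\varphi\,\mathcal{U}\,\psi), - \rangle \land \mathcal{A}(\varphi))$

$\mathcal{A}(\varphi\,\mathcal{W}\,\psi) = \mathcal{A}(\psi) \lor (\langle true, \mathcal{A}(\varphi\,\mathcal{W}\,\psi), + \rangle \land \mathcal{A}(\varphi))$

The constructions are illustrated in Figure 3.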




Fig. 3. Alternating automata for the temporal operators $\Box$, $\Diamond$, $\mathcal{U}$, $\mathcal{W}$



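Note the close resemblance between these constructions and the expansion con-
gruences [MP95]:

$\Box\varphi \approx \varphi \land \bigcirc\Box\varphi$

$\Diamond\varphi \approx \varphi \lor \bigcirc\Diamond\varphi$

$\varphi\,\mathcal{U}\,\psi \approx \psi \lor (\varphi \land \bigcirc(\varphi\,\mathcal{U}\,\psi))$

$\varphi\,\mathcal{W}\,\psi \approx \psi \lor (\varphi \land \bigcirc(\varphi\,\mathcal{W}\,\psi))$

Alternating automata represent these congruences completely, while in addition
capturing the acceptance condition, not encoded in them.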
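The construction can be read as a recursive procedure in which each temporal operator contributes one looping node whose next-state automaton is the automaton being built. The sketch below illustrates that reading; it is not the authors' code. The tuple encoding of formulas, the operator names, and the classes `Empty`, `Node`, `And`, `Or` are assumptions, and negations are assumed to be already pushed into the state formulas, which are kept as strings.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Empty:
    """The empty automaton."""


@dataclass(eq=False)
class Node:
    nu: str          # state formula label
    delta: object    # next-state automaton (set after construction for loops)
    accepting: bool  # True for "+", False for "-"


@dataclass(eq=False)
class And:
    left: object
    right: object


@dataclass(eq=False)
class Or:
    left: object
    right: object


EPS = Empty()


def build(phi, memo=None):
    """Build an automaton for a future formula given as a nested tuple."""
    if memo is None:
        memo = {}
    if phi in memo:                     # share automata of repeated subformulas
        return memo[phi]
    op = phi[0]
    if op == "state":
        a = Node(phi[1], EPS, True)
    elif op == "and":
        a = And(build(phi[1], memo), build(phi[2], memo))
    elif op == "or":
        a = Or(build(phi[1], memo), build(phi[2], memo))
    elif op == "next":
        a = Node("true", build(phi[1], memo), True)
    elif op in ("always", "eventually", "until", "waitfor"):
        # The looping node is accepting for always and waitfor,
        # rejecting for eventually and until.
        loop = Node("true", None, op in ("always", "waitfor"))
        if op == "always":
            a = And(loop, build(phi[1], memo))
        elif op == "eventually":
            a = Or(loop, build(phi[1], memo))
        else:
            a = Or(build(phi[2], memo), And(loop, build(phi[1], memo)))
        loop.delta = a                  # close the cycle: delta is the automaton itself
    else:
        raise ValueError(f"unknown operator: {op}")
    memo[phi] = a
    return a


# Usage: always(not p or (q waitfor r)), with the negation kept inside the label.
wait_for = ("always", ("or", ("state", "!p"),
                       ("waitfor", ("state", "q"), ("state", "r"))))
automaton = build(wait_for)
print(type(automaton).__name__, automaton.left.delta is automaton)
```

Changing `waitfor` to `until` in the example yields the same structure with a rejecting looping node, which is the only place where the two operators differ.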

Example The automaton for the wait-for formula, for state formulas $p$, $q$, and
$r$:

$\varphi : \Box(p \rightarrow q\,\mathcal{W}\,r)$

is shown in Figure 4. A run for the sequence

$\sigma : \langle \neg p, \_, \_ \rangle, \langle p, q, \neg r \rangle, \langle \neg p, q, \neg r \rangle, \langle p, \neg q, r \rangle, \langle \neg p, \_, \_ \rangle, \langle \neg p, \_, \ldots$

is shown in Figure 5.

The automaton for the formula

$\psi : \Box(p \rightarrow q\,\mathcal{U}\,r)$

is identical to that of Figure 4 except that node $n_4$ is rejecting. ■

Fig. 4. Alternating automaton for $\Box(p \rightarrow q\,\mathcal{W}\,r)$



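It is easy to see that, without changing the language of the automaton, we
can make the following simplifications to the construction, for state formula $p$:

$\mathcal{A}(\Box p) = \langle p, \mathcal{A}(\Box p), + \rangle$

$\mathcal{A}(p\,\mathcal{U}\,\psi) = \mathcal{A}(\psi) \lor \langle p, \mathcal{A}(p\,\mathcal{U}\,\psi), - \rangle$

$\mathcal{A}(p\,\mathcal{W}\,\psi) = \mathcal{A}(\psi) \lor \langle p, \mathcal{A}(p\,\mathcal{W}\,\psi), + \rangle$

In this case the automaton for $\Box(p \rightarrow q\,\mathcal{W}\,r)$ becomes the one shown in
Figure 6.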

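Theorem 1. For a future temporal formula $\varphi$, $\mathcal{L}(\varphi) = \mathcal{L}(\mathcal{A}(\varphi))$.

Before we prove this theorem, we state a supporting lemma, which we will
use without mention in the proof below.

Lemma 1. For two automata $\mathcal{A}_1$ and $\mathcal{A}_2$,
$\mathcal{L}(\mathcal{A}_1) \cap \mathcal{L}(\mathcal{A}_2) = \mathcal{L}(\mathcal{A}_1 \land \mathcal{A}_2)$
$\mathcal{L}(\mathcal{A}_1) \cup \mathcal{L}(\mathcal{A}_2) = \mathcal{L}(\mathcal{A}_1 \lor \mathcal{A}_2)$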



Proof This follows directly from the definition of a run.

Fig. 5. Run of $\sigma$ in $\mathcal{A}(\Box(p \rightarrow q\,\mathcal{W}\,r))$

Fig. 6. (Simplified) alternating automaton for $\Box(p \rightarrow q\,\mathcal{W}\,r)$

Proof of Theorem 1 The proof is by induction on the structure of the formula.
In each of the cases we assume an arbitrary sequence of states $\sigma : s_0, s_1, \ldots$
and denote the sequence $s_i, s_{i+1}, \ldots$ by $\sigma^i$.

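- $\varphi$ is a state formula. In this case $\mathcal{A}(\varphi) = \langle \varphi, \epsilon_{\mathcal{A}}, + \rangle$. $\mathcal{A}(\varphi)$ accepts all
sequences whose first state satisfies $\varphi$, which are exactly the sequences that
satisfy $\varphi$.

- $\varphi = \varphi_1 \land \varphi_2$; $\varphi = \varphi_1 \lor \varphi_2$. These follow directly from Lemma 1.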

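- $\varphi = \bigcirc\psi$. In this case $\mathcal{A}(\varphi) = \langle true, \mathcal{A}(\psi), + \rangle$. We prove the two directions
separately.

• Assume $\sigma \in \mathcal{L}(\mathcal{A}(\varphi))$. Then $\sigma$ has an accepting run $T$ in $\mathcal{A}(\varphi)$ such that
$T = \langle n, T' \rangle$, and $T'$ is a run of $\sigma^1$ in $\mathcal{A}(\psi)$. Clearly if $T$ is accepting, so is
$T'$, and thus $\sigma^1 \in \mathcal{L}(\mathcal{A}(\psi))$. Then, by the inductive hypothesis, $\sigma^1 \models \psi$,
and thus, by the definition of $\bigcirc$, $\sigma \models \varphi$.

• Assume $\sigma \models \bigcirc\psi$. Then $\sigma^1 \models \psi$, and, by the inductive hypothesis, $\sigma^1 \in
\mathcal{L}(\mathcal{A}(\psi))$, and, by the definition of a run, $\sigma \in \mathcal{L}(\langle true, \mathcal{A}(\psi), + \rangle)$.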

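- $\varphi = \Box\psi$. In this case $\mathcal{A}(\varphi) = \langle true, \mathcal{A}(\varphi), + \rangle \land \mathcal{A}(\psi)$. We prove the two
directions separately.

• Assume $\sigma \in \mathcal{L}(\mathcal{A}(\varphi))$, and for the sake of contradiction, $\sigma \not\models \Box\psi$. Then
there must be some $i \geq 0$ such that $\sigma^i \not\models \psi$. But then, by the inductive
hypothesis, $\sigma^i \notin \mathcal{L}(\mathcal{A}(\psi))$, and, by the definition of $\mathcal{A}(\Box\psi)$ (and by
Lemma 1), $\sigma^i \notin \mathcal{L}(\mathcal{A}(\Box\psi))$. For $i = 0$ this contradicts the assumption
that $\sigma \in \mathcal{L}(\mathcal{A}(\varphi))$. For $i > 0$ we have that $\sigma^{i-1} \notin \mathcal{L}(\langle true, \mathcal{A}(\Box\psi), + \rangle)$
and thus (again by Lemma 1 and the definition of $\mathcal{A}(\Box\psi)$) that $\sigma^{i-1} \notin
\mathcal{L}(\mathcal{A}(\Box\psi))$. By downwards induction on $i$ we can conclude that $\sigma \notin
\mathcal{L}(\mathcal{A}(\Box\psi))$, a contradiction. Therefore $\sigma \models \Box\psi$.

• Assume $\sigma \models \Box\psi$. Then, by the semantics of $\Box$, for all $i \geq 0$, $\sigma^i \models \psi$, and
therefore, by the induction hypothesis, for all $i \geq 0$, $\sigma^i \in \mathcal{L}(\mathcal{A}(\psi))$. Let
$n_0$ be $\langle true, \mathcal{A}(\Box\psi), + \rangle$. We construct a tree $T$ as shown in Figure 7,
where $T^i$ is an accepting run of $\sigma^i$ in $\mathcal{A}(\psi)$, which exists since for all
$i \geq 0$, $\sigma^i \in \mathcal{L}(\mathcal{A}(\psi))$. We claim that $T$ is an accepting run of $\sigma$ in $\mathcal{A}(\Box\psi)$
and therefore $\sigma \in \mathcal{L}(\mathcal{A}(\varphi))$. Indeed, consider an infinite branch in $T$. This
branch is either $n_0, n_0, n_0, \ldots$ or it ends in a $T^i$. In any case it contains
an infinite number of accepting nodes.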

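- $\varphi = \Diamond\psi$. In this case $\mathcal{A}(\varphi) = \langle true, \mathcal{A}(\varphi), - \rangle \lor \mathcal{A}(\psi)$.

• Assume $\sigma \in \mathcal{L}(\mathcal{A}(\varphi))$ and let $T$ be an accepting run of $\sigma$ in $\mathcal{A}(\varphi)$. Let $n_0$ be
$\langle true, \mathcal{A}(\varphi), - \rangle$. Then $T$ has to be of the form $n_0, \ldots, n_0, T'$, where $T'$
is an accepting run of $\mathcal{A}(\psi)$. (Note that it cannot be $n_0, n_0, \ldots$, because
$n_0$ is rejecting.) Therefore there exists a suffix $\sigma^i$ of $\sigma$ such that $\sigma^i \in
\mathcal{L}(\mathcal{A}(\psi))$. By the induction hypothesis, $\sigma^i \models \psi$ and therefore $\sigma \models \Diamond\psi$.

• Assume $\sigma \models \Diamond\psi$. Then, by the semantics of $\Diamond$, there exists $i \geq 0$ such
that $\sigma^i \models \psi$, and by the inductive hypothesis, $\sigma^i \in \mathcal{L}(\mathcal{A}(\psi))$, and thus
(by the definition of $\mathcal{A}(\Diamond\psi)$ and Lemma 1) $\sigma^i \in \mathcal{L}(\mathcal{A}(\Diamond\psi))$. If $i = 0$ we
are done. If $i > 0$ then we have that $\sigma^{i-1} \in \mathcal{L}(\langle true, \mathcal{A}(\Diamond\psi), - \rangle)$, and
thus $\sigma^{i-1} \in \mathcal{L}(\mathcal{A}(\Diamond\psi))$. By downwards induction on $i$ we can conclude
that $\sigma \in \mathcal{L}(\mathcal{A}(\Diamond\psi))$.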

- $\varphi = \varphi_1\,\mathcal{U}\,\varphi_2$; $\varphi = \varphi_1\,\mathcal{W}\,\varphi_2$. The proof of these cases proceeds along the same
lines as those for $\Box\psi$ and $\Diamond\psi$, and is omitted.

Fig. 7. Tree for $\sigma$ in $\mathcal{A}(\Box\psi)$



5 Temporal Verification Rule for Future Safety Formulas 



Alternating automata can be used to automatically reduce the verification of an
arbitrary safety property specified by a future formula to first-order verification
conditions, where a safety property is defined to be a property $\varphi$ such that if
a sequence $\sigma$ does not satisfy $\varphi$, then there is a finite prefix of $\sigma$ such that $\varphi$ is
false on every extension of this prefix.

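We define the initial condition of an alternating automaton $\mathcal{A}$, denoted by
$\theta_{\mathcal{A}}(\mathcal{A})$, as follows:

$\theta_{\mathcal{A}}(\epsilon_{\mathcal{A}}) = true$

$\theta_{\mathcal{A}}(\langle \nu, \delta, f \rangle) = \nu$

$\theta_{\mathcal{A}}(\mathcal{A}_1 \land \mathcal{A}_2) = \theta_{\mathcal{A}}(\mathcal{A}_1) \land \theta_{\mathcal{A}}(\mathcal{A}_2)$

$\theta_{\mathcal{A}}(\mathcal{A}_1 \lor \mathcal{A}_2) = \theta_{\mathcal{A}}(\mathcal{A}_1) \lor \theta_{\mathcal{A}}(\mathcal{A}_2)$

Intuitively, the initial condition of an automaton characterizes the set of initial
states of sequences accepted by the automaton. For example, the initial condition
of the automaton shown in Figure 4 is

$\theta_{\mathcal{A}}(\mathcal{A}) = \neg p \lor q \lor r$ .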
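Because the initial condition never descends into the next-state automaton of a node, it can be computed by a plain structural recursion even when the automaton is cyclic. The sketch below is an illustration, not the authors' code: it evaluates the initial condition on a concrete state and checks it against the wait-for automaton obtained from the construction of Section 4, where it works out to $\neg p \lor q \lor r$. The class names, the use of Python predicates as node labels, and the node names `n0` to `n4` are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, bool]


@dataclass(eq=False)
class Empty:
    """The empty automaton."""


@dataclass(eq=False)
class Node:
    nu: Callable[[State], bool]  # state formula, given as a predicate on states
    delta: object                # next-state automaton
    accepting: bool


@dataclass(eq=False)
class And:
    left: object
    right: object


@dataclass(eq=False)
class Or:
    left: object
    right: object


def theta(automaton, state):
    """Evaluate the initial condition of the automaton on one state."""
    if isinstance(automaton, Empty):
        return True
    if isinstance(automaton, Node):
        return automaton.nu(state)        # delta is deliberately not visited
    if isinstance(automaton, And):
        return theta(automaton.left, state) and theta(automaton.right, state)
    if isinstance(automaton, Or):
        return theta(automaton.left, state) or theta(automaton.right, state)
    raise TypeError(f"not an automaton: {automaton!r}")


# Automaton for always(not p or (q waitfor r)), built by hand.
EPS = Empty()
n1 = Node(lambda s: not s["p"], EPS, True)
n2 = Node(lambda s: s["r"], EPS, True)
n3 = Node(lambda s: s["q"], EPS, True)
n4 = Node(lambda s: True, None, True)
wait = Or(n2, And(n4, n3))
n4.delta = wait
n0 = Node(lambda s: True, None, True)
wait_for = And(n0, Or(n1, wait))
n0.delta = wait_for

print(theta(wait_for, {"p": True, "q": False, "r": False}))
```

A symbolic variant that returns a formula instead of a truth value would follow the same four cases.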



Basic Rule 

Following the style of verification rules of [MP95] we can now present the basic
temporal rule B-SAFE, shown in Figure 8. In the rule we use the Hoare triple
notation $\{p\}\,\tau\,\{q\}$, which stands for $p \land \rho_\tau \rightarrow q'$. The notation $\{p\}\,\mathcal{T}\,\{q\}$ stands
for $\{p\}\,\tau\,\{q\}$ for all $\tau \in \mathcal{T}$.

Premise T1, the Initiation Condition, requires that the initial condition of $S$
implies the initial condition of the automaton $\mathcal{A}(\varphi)$. Premise T2, the Consecution
Condition, requires that for all nodes $n \in \mathcal{N}(\mathcal{A}(\varphi))$, and for all transitions
$\tau \in \mathcal{T}$, $\tau$, if enabled, leads to the initial condition of the next-state automaton
of $n$.

For a future safety formula $\varphi$ and TS $S : \langle V, \Theta_S, \mathcal{T} \rangle$,

T1. $\Theta_S \rightarrow \theta_{\mathcal{A}}(\mathcal{A}(\varphi))$

T2. $\{\nu(n)\}\,\mathcal{T}\,\{\theta_{\mathcal{A}}(\delta(n))\}$ for $n \in \mathcal{N}(\mathcal{A}(\varphi))$
----------------------------------------
$S \models \varphi$

Fig. 8. Basic temporal rule B-SAFE



General Rule 

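As is the case with the rules B-INV and B-WAIT in [MP95], rule B-SAFE is hardly
ever directly applicable, because the assertions labeling the nodes are not induc-
tive: they must be strengthened. To represent the strengthening of an automa-
ton, we add a new label $\mu$ to the definition of a node, $\langle \mu, \nu, \delta, f \rangle$, where $\mu$ is an
assertion, and we change the definition of $\theta_{\mathcal{A}}$ for a node into

$\theta_{\mathcal{A}}(\langle \mu, \nu, \delta, f \rangle) = \mu$ .

Using these definitions, Figure 9 shows the more general rule SAFE that allows
strengthening of the intermediate assertions.

For a future safety formula $\varphi$, TS $S : \langle V, \Theta_S, \mathcal{T} \rangle$,
and strengthened automaton $\mathcal{A}(\varphi)$

T0. $\mu(n) \rightarrow \nu(n)$ for $n \in \mathcal{N}(\mathcal{A}(\varphi))$

T1. $\Theta_S \rightarrow \theta_{\mathcal{A}}(\mathcal{A}(\varphi))$

T2. $\{\mu(n)\}\,\mathcal{T}\,\{\theta_{\mathcal{A}}(\delta(n))\}$ for $n \in \mathcal{N}(\mathcal{A}(\varphi))$
----------------------------------------
$S \models \varphi$

Fig. 9. General temporal rule SAFE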



Note that terminal nodes, that is, nodes with $\delta = \epsilon_{\mathcal{A}}$, never need to be
strengthened. This is so because consecution conditions from terminal nodes
are all of the form $\mu(n) \land \rho_\tau \rightarrow true$, since $\theta_{\mathcal{A}}(\epsilon_{\mathcal{A}}) = true$, and thus trivially
valid.

Some strengthening can be applied automatically. For automata of the form

$\mathcal{A} : n_0 : \langle true, \mathcal{A}, f \rangle \land \mathcal{A}_1$

node $n_0$ can be strengthened with $\theta_{\mathcal{A}}(\mathcal{A}_1)$ without changing the language of the
automaton. In the following special cases we will apply this default strengthening.



Special Cases 

We consider the relation between rule safe and the special verification rules of 
[MP95]. 



INV 

For an invariance formula $\varphi : \Box p$, with $p$ a state formula, the (simplified)
automaton $\mathcal{A}(\varphi)$ consists of a single node $n_0$, labeled by $p$ and with next-state
automaton $n_0$. If we strengthen node $n_0$ with $\chi$, rule SAFE produces the following
verification conditions (after applying the default strengthening):

T0. $\chi \rightarrow p$ for $n_0$

T1. $\Theta_S \rightarrow \chi$

T2. $\{\chi\}\,\mathcal{T}\,\{\chi\}$ for $n_0$

which is identical to rule INV.



WAIT 

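For a wait-for formula,

$\varphi : \Box(p \rightarrow q\,\mathcal{W}\,r)$

rule SAFE, using the automaton shown in Figure 6 with node $n_3$ strengthened
by $\chi$, results in the following verification conditions (after applying the default
strengthening):

T0. $\chi \rightarrow q$ for $n_3$

T1. $\Theta_S \rightarrow \neg p \lor r \lor \chi$

T2. $\{\neg p \lor r \lor \chi\}\,\mathcal{T}\,\{\neg p \lor r \lor \chi\}$ for $n_0$

    $\{\chi\}\,\mathcal{T}\,\{\chi \lor r\}$ for $n_3$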

Comparing these conditions with those of rule WAIT we notice that the state-
validity $p \rightarrow \chi \lor r$ in WAIT has been replaced by the invariance conditions for
the same formula, represented by T1 and the first set of conditions of T2. The
other conditions are identical.

N-WAIT

Consider the nested wait-for formula, with $p, q_0, \ldots, q_n$ state formulas:

$\Box(p \rightarrow q_n\,\mathcal{W}\,(q_{n-1} \ldots \mathcal{W}\,(q_1\,\mathcal{W}\,q_0) \ldots))$

The procedure described above generates the (simplified) automaton shown in
Figure 10 for this formula. It is easy to see that this automaton will generate the
same verification conditions as given in NWAIT, again with the exception that
the state validity in NWAIT is replaced by the invariance conditions for the same
formula.

Fig. 10. Alternating automaton for $\Box(p \rightarrow q_n\,\mathcal{W}\,(q_{n-1} \ldots \mathcal{W}\,(q_1\,\mathcal{W}\,q_0) \ldots))$



Theorem 2 (Soundness of B-SAFE). For a TS $S$ and future safety formula $\varphi$,
if the premises T1 and T2 of rule B-SAFE are $S$-state valid then $S \models \varphi$.

Before we prove this theorem we present some supporting definitions and
lemmas.



Definition 6 (k-Run) A finite tree $T$ is a $k$-run of an infinite sequence of states
$\sigma : s_0, s_1, \ldots$ in an alternating automaton $\mathcal{A}$ if one of the following holds:

$\mathcal{A} = \epsilon_{\mathcal{A}}$ and $T = \epsilon_T$

$\mathcal{A} = n$ and $T = \langle n, T' \rangle$ and $s_0 \models \nu(n)$ and
    (a) $k = 1$ and $T' = \epsilon_T$, or
    (b) $k > 1$ and $T'$ is a $(k-1)$-run of $\sigma^1$ in $\delta(n)$

$\mathcal{A} = \mathcal{A}_1 \land \mathcal{A}_2$ and $T = T_1 \cdot T_2$ and $T_1$ is a $k$-run of $\sigma$ in $\mathcal{A}_1$
    and $T_2$ is a $k$-run of $\sigma$ in $\mathcal{A}_2$

$\mathcal{A} = \mathcal{A}_1 \lor \mathcal{A}_2$ and $T$ is a $k$-run of $\sigma$ in $\mathcal{A}_1$ or $T$ is a $k$-run of $\sigma$ in $\mathcal{A}_2$

Definition 7 (k-Model) An infinite sequence of states $\sigma$ is a $k$-model of an
alternating automaton $\mathcal{A}$ if there exists a $k$-run of $\sigma$ in $\mathcal{A}$.

Example The sequence

$\langle \neg p, \_, \_ \rangle, \langle p, q, \neg r \rangle, \ldots$

is a 2-model of the automaton shown in Figure 4. Its 2-run is shown in Figure 11.

Fig. 11. 2-run of $\langle \neg p, \_, \_ \rangle, \langle p, q, \neg r \rangle, \ldots$



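Definition 8 (k-Nodes) The set of $k$-nodes of a tree $T$, written $\mathcal{N}_k(T)$, consists
of the nodes present at depth $k$. Formally:

$\mathcal{N}_k(\epsilon_T) = \emptyset$

$\mathcal{N}_k(\langle n, T \rangle) = \{n\}$ if $k = 1$, and $\mathcal{N}_{k-1}(T)$ otherwise

$\mathcal{N}_k(T_1 \cdot T_2) = \mathcal{N}_k(T_1) \cup \mathcal{N}_k(T_2)$

Example For the tree $T$ shown in Figure 5, $\mathcal{N}_1(T) = \{n_0, n_1\}$, $\mathcal{N}_2(T) =
\{n_0, n_3, n_4\}$, $\mathcal{N}_3(T) = \{n_0, n_1, n_3, n_4\}$, etc. ■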

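Lemma 2. For an infinite sequence of states $\sigma : s_0, s_1, \ldots$, $k > 0$, and alternat-
ing automata $\mathcal{A}$, $\mathcal{A}_1$, and $\mathcal{A}_2$ the following hold:

- $\sigma$ is a $k$-model of $\epsilon_{\mathcal{A}}$.

- $\sigma$ is a $k$-model of $\mathcal{A} = n$ iff $s_0 \models \nu(n)$ and $\sigma^1$ is a $(k-1)$-model of $\delta(n)$.

- $\sigma$ is a $k$-model of $\mathcal{A}_1 \land \mathcal{A}_2$ iff $\sigma$ is a $k$-model of $\mathcal{A}_1$ and $\sigma$ is a $k$-model of $\mathcal{A}_2$.

- $\sigma$ is a $k$-model of $\mathcal{A}_1 \lor \mathcal{A}_2$ iff $\sigma$ is a $k$-model of $\mathcal{A}_1$ or $\sigma$ is a $k$-model of $\mathcal{A}_2$.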

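Proof These follow directly from the definition of $k$-run and $k$-model.

Lemma 3 (Initial Condition). An infinite sequence of states $\sigma : s_0, \ldots$ is a
1-model of an alternating automaton $\mathcal{A}$ iff $s_0 \models \theta_{\mathcal{A}}(\mathcal{A})$.

Proof The proof is by induction on the structure of the automaton.

- $\mathcal{A} = \epsilon_{\mathcal{A}}$. By Lemma 2, the sequence $\sigma$ is a 1-model of $\mathcal{A}$; $\theta_{\mathcal{A}}(\epsilon_{\mathcal{A}}) = true$ and
thus $s_0 \models \theta_{\mathcal{A}}(\mathcal{A})$.

- $\mathcal{A} = n$. The sequence $\sigma$ is a 1-model of $\mathcal{A}$ iff $s_0 \models \nu(n)$ iff $s_0 \models \theta_{\mathcal{A}}(\mathcal{A})$.

- $\mathcal{A} = \mathcal{A}_1 \land \mathcal{A}_2$. By Lemma 2, the sequence $\sigma$ is a 1-model of $\mathcal{A}$ iff $\sigma$ is a
1-model of $\mathcal{A}_1$ and $\sigma$ is a 1-model of $\mathcal{A}_2$, iff, by the inductive hypothesis,
$s_0 \models \theta_{\mathcal{A}}(\mathcal{A}_1)$ and $s_0 \models \theta_{\mathcal{A}}(\mathcal{A}_2)$, iff $s_0 \models \theta_{\mathcal{A}}(\mathcal{A}_1) \land \theta_{\mathcal{A}}(\mathcal{A}_2)$ iff, by the definition
of $\theta_{\mathcal{A}}$, $s_0 \models \theta_{\mathcal{A}}(\mathcal{A}_1 \land \mathcal{A}_2)$.

- $\mathcal{A} = \mathcal{A}_1 \lor \mathcal{A}_2$. Similar to the above case.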
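The definition of a $k$-run translates into a terminating recursion on $k$ and on the structure of the automaton, since only the first $k$ states of the sequence are inspected. The sketch below checks $k$-model membership on a finite prefix. It is an illustration under assumptions, not the authors' code: the class names, the predicate-valued node labels, and the hand-built wait-for automaton with nodes `n0` to `n4` are invented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, bool]


@dataclass(eq=False)
class Empty:
    """The empty automaton."""


@dataclass(eq=False)
class Node:
    nu: Callable[[State], bool]  # state formula as a predicate on states
    delta: object                # next-state automaton
    accepting: bool


@dataclass(eq=False)
class And:
    left: object
    right: object


@dataclass(eq=False)
class Or:
    left: object
    right: object


def is_k_model(automaton, sigma, k, pos=0):
    """True iff the suffix of sigma starting at pos has a k-run (k >= 1).

    sigma is a finite prefix holding at least pos + k states.
    """
    if isinstance(automaton, Empty):
        return True
    if isinstance(automaton, Node):
        if not automaton.nu(sigma[pos]):
            return False
        # Case (a): depth exhausted. Case (b): continue in delta one state later.
        return k == 1 or is_k_model(automaton.delta, sigma, k - 1, pos + 1)
    if isinstance(automaton, And):
        return (is_k_model(automaton.left, sigma, k, pos)
                and is_k_model(automaton.right, sigma, k, pos))
    if isinstance(automaton, Or):
        return (is_k_model(automaton.left, sigma, k, pos)
                or is_k_model(automaton.right, sigma, k, pos))
    raise TypeError(f"not an automaton: {automaton!r}")


# Automaton for always(not p or (q waitfor r)), built by hand.
EPS = Empty()
n1 = Node(lambda s: not s["p"], EPS, True)
n2 = Node(lambda s: s["r"], EPS, True)
n3 = Node(lambda s: s["q"], EPS, True)
n4 = Node(lambda s: True, None, True)
wait = Or(n2, And(n4, n3))
n4.delta = wait
n0 = Node(lambda s: True, None, True)
wait_for = And(n0, Or(n1, wait))
n0.delta = wait_for

# p is raised with q holding, then q is dropped before r occurs.
prefix = [{"p": True, "q": True, "r": False},
          {"p": False, "q": False, "r": False}]
print(is_k_model(wait_for, prefix, 1), is_k_model(wait_for, prefix, 2))
```

With $k = 1$ the check reduces to evaluating the node labels at the top level on $s_0$, which is the content of Lemma 3.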



Lemma 4. If an infinite sequence of states $\sigma$ is a $k$-model of an alternating
automaton $\mathcal{A}$ with $k$-run $T$, and if for all nodes $n \in \mathcal{N}_k(T)$, $\sigma^k$ is a 1-model of
$\delta(n)$, then $\sigma$ is a $(k+1)$-model of $\mathcal{A}$.

Proof Assume $T$ is a $k$-run of $\sigma$ in $\mathcal{A}$, and for all $n \in \mathcal{N}_k(T)$, $\sigma^k$ is a 1-
model of $\delta(n)$ with 1-run $T_n$. Then we can construct a tree $T'$ by replacing each
occurrence of $\langle n, \epsilon_T \rangle$ in $T$ by $\langle n, T_n \rangle$. It is easy to see that $T'$ is a $(k+1)$-run of
$\sigma$ in $\mathcal{A}$, and thus $\sigma$ is a $(k+1)$-model of $\mathcal{A}$. ■

Lemma 5. An infinite sequence of states $\sigma : s_0, s_1, \ldots$ is a $k$-model, with $k > 0$,
of an alternating automaton $\mathcal{A}$, with $k$-run $T$ iff for all $1 \leq j \leq k$, for all nodes
$n \in \mathcal{N}_j(T)$, $s_{j-1} \models \nu(n)$.

Proof By induction on $k$, and the structure of the tree $T$. ■

Example Consider Figure 5. The state $s_2 : \langle \neg p, q, \neg r \rangle$ indeed satisfies the
assertion $\nu$ of all nodes in $\mathcal{N}_3(T) = \{n_0, n_1, n_3, n_4\}$. ■

Lemma 6. Given a TS $S$, a future temporal formula $\varphi$, and an infinite sequence
of states $\sigma \in \mathcal{L}(S)$, if the premises T1 and T2 are $S$-state valid, then $\sigma$ is a $k$-
model of $\mathcal{A}(\varphi)$ for all $k > 0$.

Proof The proof is by induction on $k$.

- Base case. By the definition of a run of $S$, $s_0 \models \Theta_S$, and thus, by premise
T1, $s_0 \models \theta_{\mathcal{A}}(\mathcal{A}(\varphi))$. Then, by Lemma 3, $\sigma$ is a 1-model of $\mathcal{A}(\varphi)$.

- Inductive step. Assume $\sigma$ is a $k$-model of $\mathcal{A}(\varphi)$ with $k$-run $T$. Then, by
Lemma 5, for all $n \in \mathcal{N}_k(T)$, $s_{k-1} \models \nu(n)$. By the consecution requirement
for runs, there exists some transition $\tau \in \mathcal{T}$ such that $(s_{k-1}, s_k)$ satisfies $\rho_\tau$.
By premise T2, for every node $n \in \mathcal{N}(\mathcal{A}(\varphi))$, every transition starting from
a state that satisfies $\nu(n)$ leads to a state that satisfies $\theta_{\mathcal{A}}(\delta(n))$, and thus,
by Lemma 3 and Lemma 4, $\sigma$ is a $(k+1)$-model of $\mathcal{A}(\varphi)$.

Lemma 7. For a safety formula $\varphi$, if a sequence of states $\sigma$ is a $k$-model of
$\mathcal{A}(\varphi)$ for every $k > 0$, then $\sigma$ is a model of $\mathcal{A}(\varphi)$.

Proof This follows directly from the definition of a safety formula: a safety
property holds iff it cannot be violated in finite time. ■

Proof of Theorem 2 Consider a TS $S$ and a future safety formula $\varphi$, and
assume that the premises T1 and T2 are $S$-state valid. Consider an arbitrary
sequence of states $\sigma \in \mathcal{L}(S)$. By Lemma 6, $\sigma$ is a
$k$-model of $\mathcal{A}(\varphi)$ for all $k > 0$. Then, by Lemma 7, $\sigma$ is a model of $\mathcal{A}(\varphi)$. Finally,
by Theorem 1, $\sigma \models \varphi$. ■



6 Past formulas 

In Section 4 we showed how to construct alternating automata for LTL formulas
containing future operators only. Here we will extend the procedure to include
past operators as well.

To define an alternating automaton for LTL formulas including past operators,
we add a component $g$ to the definition of a node, such that a node is now defined
as

$\langle \nu, \delta, f, g \rangle$

where $g$ indicates whether the node is past (indicated by “$\leftarrow$”) or future (indi-
cated by “$\rightarrow$”).

To accommodate the presence of past nodes we extend the notions of a run
and model of an automaton.



Definition 9 (Run) Given an infinite sequence of states $\sigma : s_0, s_1, \ldots$ and a
position $j \geq 0$, a tree $T$ is called a run of $\sigma$ in $\mathcal{A}$ at position $j$ if one of the
following holds:

$\mathcal{A} = \epsilon_{\mathcal{A}}$ and $T = \epsilon_T$

$\mathcal{A} = n$ and $T = \langle n, T' \rangle$ and $s_j \models \nu(n)$ and
    (a) $T'$ is a run of $\sigma$ in $\delta(n)$ at $j + 1$, if $g(n) = \rightarrow$, or
    (b) $T'$ is a run of $\sigma$ in $\delta(n)$ at $j - 1$, if $g(n) = \leftarrow$ and $j > 0$, or
    (c) $T' = \epsilon_T$ if $g(n) = \leftarrow$, $f(n) = +$, and $j = 0$.

$\mathcal{A} = \mathcal{A}_1 \land \mathcal{A}_2$ and $T = T_1 \cdot T_2$,
    $T_1$ is a run of $\mathcal{A}_1$ and $T_2$ is a run of $\mathcal{A}_2$

$\mathcal{A} = \mathcal{A}_1 \lor \mathcal{A}_2$ and $T$ is a run of $\mathcal{A}_1$ or $T$ is a run of $\mathcal{A}_2$

Definition 10 (Accepting run) A run $T$ is accepting if every infinite branch
of $T$ contains infinitely many accepting nodes.

Definition 11 (Model at j) An infinite sequence of states $\sigma$ is a model of an
alternating automaton $\mathcal{A}$ at position $j$ if there exists an accepting run of $\sigma$ in $\mathcal{A}$
at position $j$.

Definition 12 (Model) An infinite sequence of states $\sigma$ is a model of an al-
ternating automaton $\mathcal{A}$ if it is a model at position 0.

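Given an LTL formula $\varphi$, an alternating automaton $\mathcal{A}(\varphi)$ is constructed as before,
where all nodes constructed before are future nodes, and with the following
additions for the past operators:

$\mathcal{A}(\tilde{\ominus}\varphi) = \langle true, \mathcal{A}(\varphi), +, \leftarrow \rangle$

$\mathcal{A}(\ominus\varphi) = \langle true, \mathcal{A}(\varphi), -, \leftarrow \rangle$

$\mathcal{A}(\boxminus\varphi) = \langle true, \mathcal{A}(\boxminus\varphi), +, \leftarrow \rangle \land \mathcal{A}(\varphi)$

$\mathcal{A}(\diamondminus\varphi) = \langle true, \mathcal{A}(\diamondminus\varphi), -, \leftarrow \rangle \lor \mathcal{A}(\varphi)$

$\mathcal{A}(\varphi\,\mathcal{S}\,\psi) = \mathcal{A}(\psi) \lor (\langle true, \mathcal{A}(\varphi\,\mathcal{S}\,\psi), -, \leftarrow \rangle \land \mathcal{A}(\varphi))$

$\mathcal{A}(\varphi\,\mathcal{B}\,\psi) = \mathcal{A}(\psi) \lor (\langle true, \mathcal{A}(\varphi\,\mathcal{B}\,\psi), +, \leftarrow \rangle \land \mathcal{A}(\varphi))$

Again, these constructions closely resemble the expansion congruences for the
past formulas:

$\boxminus\varphi \approx \varphi \land \tilde{\ominus}\boxminus\varphi$

$\diamondminus\varphi \approx \varphi \lor \ominus\diamondminus\varphi$

$\varphi\,\mathcal{S}\,\psi \approx \psi \lor (\varphi \land \ominus(\varphi\,\mathcal{S}\,\psi))$

$\varphi\,\mathcal{B}\,\psi \approx \psi \lor (\varphi \land \tilde{\ominus}(\varphi\,\mathcal{B}\,\psi))$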



Example For a causality formula $\varphi : \Box(p \rightarrow \diamondminus r)$ with $p$ and $r$ state
formulas, the automaton is shown in Figure 12. ■

Fig. 12. Alternating automaton for $\Box(p \rightarrow \diamondminus r)$



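We denote with $\mathcal{PN}(\mathcal{A})$ the past nodes of $\mathcal{A}$ and with $\mathcal{FN}(\mathcal{A})$ the future
nodes of $\mathcal{A}$.

We can now formulate the verification rule GSAFE, shown in Figure 13, that is
applicable to arbitrary general safety formulas. Again, we augment the definition
of a node with a strengthening $\mu$.

For a general safety formula $\varphi$, TS $S : \langle V, \Theta_S, \mathcal{T} \rangle$,
and strengthened automaton $\mathcal{A}(\varphi)$

T0. $\mu(n) \rightarrow \nu(n)$ for $n \in \mathcal{N}(\mathcal{A}(\varphi))$

T1. $\Theta_S \rightarrow \theta_{\mathcal{A}}(\mathcal{A}(\varphi))$

T2. $\{\mu(n)\}\,\mathcal{T}\,\{\theta_{\mathcal{A}}(\delta(n))\}$ for $n \in \mathcal{FN}(\mathcal{A}(\varphi))$

    $\{\mu(n)\}\,\mathcal{T}^{-1}\,\{\theta_{\mathcal{A}}(\delta(n))\}$ for $n \in \mathcal{PN}(\mathcal{A}(\varphi))$

T3. $\Theta_S \rightarrow \neg\mu(n)$ for $n \in \mathcal{PN}_{rej}(\mathcal{A}(\varphi))$
----------------------------------------
$S \models \varphi$

Fig. 13. General safety rule GSAFE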



Premise T0 requires that the assertions used to strengthen the node imply the
original assertions. Premise T1 requires that the initial condition of the system
implies the (strengthened) initial condition of the automaton for $\varphi$. Premise T2
requires the regular consecution condition for all future nodes, and the inverse
consecution condition for all past nodes, where $\mathcal{T}^{-1} = \{\tau^{-1} \mid \tau \in \mathcal{T}\}$ and

$\{p\}\,\tau^{-1}\,\{q\} = p' \land \rho_\tau \rightarrow q$ .

Finally, premise T3 requires that no initial state satisfies a rejecting past node.
This last requirement ensures that “promises for the past” are fulfilled before
the first state is reached.



Special cases 



As for future formulas we compare the verification conditions produced by rule 
GSAFE with the premises of a special verification rule involving past operators 
presented in [MP95]. 



CAUS 



For a causality formula $\varphi : \Box(p \rightarrow \diamondminus r)$ with $p$ and $r$ state formulas, rule
GSAFE, based on the automaton shown in Figure 12 with node $n_2$ strengthened
with $\chi$, results in the following verification conditions:

T0. $\chi \rightarrow true$ for $n_2$

T1. $\Theta_S \rightarrow \neg p \lor \chi \lor r$

T2. $\{\neg p \lor \chi \lor r\}\,\mathcal{T}\,\{\neg p \lor \chi \lor r\}$ for $n_0$

    $\{\chi\}\,\mathcal{T}^{-1}\,\{\chi \lor r\}$ for $n_2$

T3. $\Theta_S \rightarrow \neg\chi$

For state formulas $p$ and $r$, the premises of the rule CAUS are the following:

C1. $p \rightarrow \chi \lor r$

C2. $\Theta_S \rightarrow \neg\chi \lor r$

C3. $\{\chi\}\,\mathcal{T}^{-1}\,\{\chi \lor r\}$

Premise C1 corresponds to premises T1, and T2 for $n_0$. Premise C2 is represented
by the (stronger) premise T3, and premise C3 is identical to premise T2 for $n_2$.

7 Discussion 

The work presented in this paper is a first attempt to use alternating automata 
as a basis for the deductive verification of LTL properties. We have shown that
it successfully generalizes several of the special verification rules for the corre- 
sponding classes of formulas, thus obviating the need to implement these rules 
separately in a verification tool. Instead the single rule GSAFE suffices. The 
rule SAFE has been implemented in STeP, the Stanford Temporal Prover, a 
verification tool for algorithmic and deductive verification of reactive systems 
[BBC+95,BBC+00]. 

It is straightforward to extend rule SAFE to general reactivity properties by
adding a fourth premise

T3. $\Box\Diamond\neg\mu(n)$ for all rejecting nodes $n \in \mathcal{N}_{rej}(\mathcal{A}(\varphi))$

which can be reduced to first-order verification conditions by one of the rules 
given in [MP91]. The reason we have omitted this extension from this paper 
is that we found this rule not useful for many of these properties. We are cur- 
rently working on a more general rule that allows the application of different 
proof methods on different parts of the automaton. This will be presented in a 
forthcoming paper. 



8 Related Work 

Alternating finite state automata over finite words were first introduced in
[CKS81] and shown to be exponentially more succinct than nondeterministic
automata. In [MH84] it was shown that alternating finite automata on infinite
words with Büchi acceptance conditions are as expressive as nondeterministic $\omega$-
automata with Büchi acceptance conditions, and in [MSS88,Var94] it was shown
that for every LTL formula an alternating Büchi automaton can be constructed
that is linear in the size of the formula. In [Var95,Var96,Var97] alternating au-
tomata are used as the basis for a model checking algorithm for finite-state
systems and both linear and branching time temporal logics. In [Var98] a theory
of two-way alternating automata on infinite trees is developed to check satisfia-
bility of $\mu$-calculus formulas with both forward and backward modalities. In that
paper a direction is introduced in the next-state relation, redirecting a run to the
parent node when encountering a past operator. The procedure presented in this
paper takes the opposite approach for past operators: it reverses the direction
in the sequence, instead of in the automaton.

Deductive verification methods using automata include a sound and complete
proof rule based on $\forall$-automata [MP87], and generalized verification diagrams
[BMS95,MBSU98].

Abstract. Establishing relationships between primitives is an impor-
tant area in the foundations of Cryptography. In this paper we con-
sider the primitive of non-interactive zero-knowledge proofs of knowl-
edge, namely, methods for writing a proof that on input $x$ the prover
knows $y$ such that relation $R(x, y)$ holds. These proofs have important
applications for the construction of cryptographic protocols, such as cryp-
tosystems and signatures that are secure under strong types of attacks.
They were first defined in [10], where a sufficient condition for the exis-
tence of such proofs for all NP relations was given. In this paper we show,
perhaps unexpectedly, that this condition, based on a variant of public-
key cryptosystems, is also necessary. Moreover, we present an alternative
and natural condition, based on a variant of commitment schemes, which
we show to be necessary and sufficient as well for the construction of such
proofs. This equivalence also allows us to improve known results on the
construction of such proofs under the hardness of specific computational
problems. Specifically, we show that assuming the hardness of factoring
Blum integers is sufficient for such constructions.



1 Introduction 

Understanding the conditions under which cryptographic applications are pos-
sible is of crucial importance for both the theoretical development of Cryptog-
raphy and the concrete realization of real-life cryptographic systems. Modern
complexity-based Cryptography is based on intractability assumptions such as
the existence of several basic primitives, the most basic being perhaps that of the
existence of one-way functions (functions that can be evaluated in polynomial
time but that no polynomial time algorithm can invert). Since the first days of
modern Cryptography a main research area has been to establish relationships
between all primitives used in this field. In some cases, researchers have been
successful in presenting necessary and sufficient conditions for the existence of
such primitives. For instance, it is known that the existence of one-way func-
tions is necessary and sufficient for the construction of other primitives
such as pseudo-random generators, commitment schemes, pseudo-random
functions, signature schemes, zero-knowledge proofs of member-
ship for all languages in NP and zero-knowledge proofs of knowledge for all NP
relations. Moreover, some other primitives, such as public-key cryp-
tosystems, are known to exist under the necessary and sufficient assumption
of the existence of one-way functions with trapdoors.

The object of this paper is to consider the primitive of non-interactive zero-
knowledge (nizk) proofs of knowledge. These proofs, first defined in [10], have im-
portant applications for the construction of cryptographic protocols with strong
security, such as public-key cryptosystems secure under chosen ciphertext attack
and digital signatures secure under existential forgery. In [10] a sufficient
condition was presented for the existence of nizk proofs of knowledge for all NP
relations, and strong evidence was given that it is not possible to construct
nizk proofs of knowledge using one-way functions only.

Summary of main contributions. We study necessary and sufficient condi- 
tions for the existence of nizk proofs of knowledge for all NP relations. Our main 
contributions can be summarized as follows: 

In [10] it is shown that the existence of dense secure cryptosystems and of
non-interactive zero-knowledge proofs of membership for all languages in NP is
sufficient for the existence of nizk proofs of knowledge for all NP relations. Here,
we show that, perhaps unexpectedly, this assumption is also necessary.

We introduce the concept of an extractable commitment (a natural concept,
dual to those of equivocable commitments and chameleon com-
mitments) and show that the notions of extractable commitments and
dense-secure cryptosystems are equivalent. It follows that the existence of ex-
tractable commitments and of nizk proofs of membership for all languages in
NP is necessary and sufficient for the existence of nizk proofs of knowledge for
all NP relations.

We then show that assuming difficulty of factoring Blum integers is suffi- 
cient for constructing extractable commitments and thus nizk proofs of knowl- 
edge. Previous constructions of nizk proofs of knowledge were based on provably 
non-weaker assumptions of the difficulty of inverting the RSA function and of 
deciding the quadratic residuosity predicate modulo Blum integers. 

Organization of the paper. Definitions are in Section 2. Later sections give the
sufficient condition based on extractable commitment schemes, the sufficient
condition based on the hardness of factoring Blum integers, and the proof that
the two conditions based on dense cryptosystems and extractable commitment
schemes are also necessary for the construction of nizk proofs of knowledge for
all NP relations.

Basic notations, definitions and most proofs are omitted for lack of space.

2 Definitions

We present the definitions of cryptographic primitives of interest in this paper. 
We review nizk proofs of membership, nizk proofs of knowledge and dense en- 
cryption schemes and define the new notion of extractable commitment scheme. 

Nizk proofs of membership. A nizk proof system of membership for a certain
language $L$ is a pair of algorithms, a prover and a verifier, the latter running in
polynomial time, such that the prover, on input a string $x$ and a public random
string, can compute a proof that convinces the verifier that the statement
'$x \in L$' is true, without revealing anything else. Such proof systems satisfy
three requirements: completeness, soundness and zero-knowledge. Completeness states that
if the input x is in the language L, with high probability, the string computed 
by the prover makes the verifier accept. Soundness states that the probability 
that any prover, given the reference string, can compute an x not in L and a 
string that makes the verifier accept, is very small. Zero-knowledge is formalized 
by saying that there exists an efficient algorithm that generates a pair which has 
distribution indistinguishable from the reference string and the proof in a real 
execution of the proof system (see 0 for a formal definition) . 

Nizk proofs of knowledge. A nizk proof system of knowledge for a certain
relation $R$ is a pair of algorithms, a prover and a verifier, the latter running in
polynomial time, such that the prover, on input a string $x$ and a public random
string, can compute a proof that convinces the verifier that 'he knows $y$ such that
$R(x,y)$'. It is defined as a nizk proof system of membership for language dom $R$
that satisfies the following additional requirement, called validity: there exists 
an efficient extractor that is able to prepare a reference string (together with 
some additional information) and extract a witness for the common input from 
any accepting proof given by the prover; moreover, this happens with essentially 
the same probability that such prover makes the verifier accept. The formal 
definition from m follows. 

Definition 1. Let $P$ be a probabilistic Turing machine and $V$ a deterministic
polynomial time Turing machine. Also, let $R$ be a relation and $L_R$ be the
language associated to $R$. We say that $(P,V)$ is a non-interactive zero-knowledge
proof system of knowledge for $R$ if it is a nizk proof of membership for $L_R$ and
if there exists a pair $(E_0, E_1)$ of probabilistic polynomial time algorithms such
that for all algorithms $P'$, for all constants $c$ and all sufficiently large $n$,
$p_0 - p_1 < n^{-c}$, where $p_0, p_1$ are, respectively,

$$\mathrm{Prob}[\,(\sigma, aux) \leftarrow E_0(1^n);\ (x, Proof) \leftarrow P'(\sigma);\ w \leftarrow E_1(\sigma, aux, x, Proof) : (x, w) \in R\,];$$

$$\mathrm{Prob}[\,\sigma \leftarrow \{0,1\}^*;\ (x, Proof) \leftarrow P'(\sigma) : V(\sigma, x, Proof) = 1\,].$$



Public-key cryptosystems. We now review the definition of public-key
cryptosystems, as first defined in ini. A public-key cryptosystem is a triple of
polynomial time algorithms (KG,E,D). The algorithm KG, called the key-generation
algorithm, is probabilistic; on input a security parameter $1^n$ and a random string
$r$ of appropriate length, KG returns a pair $(pk, sk)$, where $pk$ is called the
public key, and $sk$ is called the secret key. The algorithm E, called the
encryption algorithm, is probabilistic; on input $pk$, a bit $b$, called the
plaintext, and a random string $s$ of appropriate length, E returns a string $ct$,
called the ciphertext. The algorithm D, called the decryption algorithm, is
deterministic; on input $sk$, $pk$ and a string $ct$, D returns a bit $b$, or a
special failure symbol $\perp$. The algorithms KG, E, D have to satisfy the two
requirements of correctness and security, defined as follows. Correctness means
that for all constants $c$, all sufficiently large $n$, and all $b \in \{0,1\}$,
$p_b > 1 - n^{-c}$, where $p_b$ is equal to

$$\mathrm{Prob}[\,r, s \leftarrow \{0,1\}^*;\ (pk, sk) \leftarrow KG(1^n, r);\ ct \leftarrow E(pk, s, b);\ b' \leftarrow D(pk, sk, ct) : b' = b\,].$$

Security means that for any probabilistic polynomial time algorithm $A'$, for all
constants $c$, and any sufficiently large $n$, $|q_0 - q_1|$ is negligible, where

$$q_b = \mathrm{Prob}[\,r, s \leftarrow \{0,1\}^*;\ (pk, sk) \leftarrow KG(1^n, r);\ ct \leftarrow E(pk, s, b) : A'(pk, ct) = b\,].$$



Dense public-key cryptosystems. A dense public-key cryptosystem is defined
starting from a public-key cryptosystem, and performing the following two
variations. First, the public key output by the key-generation algorithm is
uniformly distributed over a set of the type $\{0,1\}^k$, for some integer $k$.
Second, the security requirement holds for at least a noticeable^1 set of public
keys output by the key-generation algorithm, rather than for all of them.

Definition 2. Let (KG,E,D) be a triple containing the above defined algorithms;
namely, KG is a key-generation algorithm, E is an encryption algorithm and D is a
decryption algorithm. Also, let (KG,E,D) satisfy the above defined correctness
requirement. We say that (KG,E,D) is a $\delta$-dense public-key cryptosystem if
the following two requirements hold:

1. Uniformity. For all $n \in \mathcal{N}$, and all constants $c$, it holds that

$$\sum_{\alpha} \left|\, \mathrm{Prob}[\alpha \leftarrow D_n] - \mathrm{Prob}[\alpha \leftarrow U_n] \,\right| < n^{-c},$$

where $D_n = \{r \leftarrow \{0,1\}^*;\ (pk, sk) \leftarrow KG(1^n, r) : pk\}$ and
$U_n = \{r \leftarrow \{0,1\}^*;\ (pk, sk) \leftarrow KG(1^n, r);\ s \leftarrow \{0,1\}^{|pk|} : s\}$.

2. $\delta$-Security. For any $m \in \mathcal{N}$, there exists a set
$Hard_m \subseteq \{0,1\}^{q(m)}$ of size $\delta(m) \cdot 2^{q(m)}$, for some
noticeable function $\delta$, such that for any probabilistic polynomial time
algorithm $A = (A_1, A_2)$, for any constant $c$, any sufficiently large $n$, and
any $r \in Hard_n$,

$$\mathrm{Prob}[\,(pk, sk) \leftarrow KG(1^n, r);\ (m_0, m_1, aux) \leftarrow A_1(pk);\ b \leftarrow \{0,1\};\ c \leftarrow E(pk, m_b);\ d \leftarrow A_2(pk, c, aux) : d = b\,] < 1/2 + n^{-c}.$$

Commitment schemes. A bit commitment scheme (A,B) in the public random
string model is a two-phase interactive protocol between two probabilistic
polynomial time parties A and B, called the committer and the receiver, respectively,
such that the following is true. In the first phase (the commitment phase), given
the public random string $\sigma$, A commits to a bit $b$ by computing a pair of keys
$(com, dec)$ and sending $com$ (the commitment key) to B. In the second phase
(the decommitment phase) A reveals the bit $b$ and the key $dec$ (the decommitment
key) to B. Now B checks whether the decommitment key is valid; if not,
B outputs a special string $\perp$, meaning that he rejects the decommitment from
A; otherwise, B can efficiently compute the bit $b$ revealed by A. (A,B) has
to satisfy three requirements: correctness, security and binding. The correctness
requirement states that the probability that algorithm B, on input $\sigma, com, dec$,
can compute the committed bit $b$ after the reference string $\sigma$ has been
uniformly chosen, and after algorithm A, on input $\sigma, b$, has returned the pair
$(com, dec)$, is at least $1 - \epsilon$, for some negligible function $\epsilon$.
The security requirement states that given just $\sigma$ and the commitment key
$com$, no polynomial-time receiver B' can distinguish with probability more than
negligible whether the pair $(\sigma, com)$ is a commitment to 0 or to 1. The
binding requirement states that after $\sigma$ has been uniformly chosen, for any
algorithm A' returning three strings $com', dec_0, dec_1$, the probability that
$dec_b$ is a valid decommitment for $\sigma, com'$ as bit $b$, for $b = 0, 1$,
is negligible.

^1 A function $f : \mathcal{N} \to \mathcal{N}$ is noticeable if there exists a
constant $c$ such that for all sufficiently large $n$, $f(n) > n^{-c}$.

Extractable Commitment Schemes. A commitment scheme is extractable if 
there exists an efficient extractor algorithm that is able to prepare a reference 
string (together with some additional information) and extract the value of the 
committed bit from any valid commitment key sent by the prover; here, by 
a valid commitment key we mean a commitment key for which there exists a 
decommitment key that would allow the receiver to obtain the committed bit. 
A formal definition follows. 

Definition 3. Let (A,B) be a commitment scheme. We say that (A,B) is
extractable if there exists a pair $(E_0, E_1)$ of probabilistic polynomial time
algorithms such that for all algorithms $A'$, for all $b \in \{0,1\}$, for all
constants $c$ and all sufficiently large $n$, it holds that $p_0 - p_1 < n^{-c}$,
where $p_0, p_1$ are, resp.,

$$\mathrm{Prob}[\,(\sigma, aux) \leftarrow E_0(1^n);\ (com, dec) \leftarrow A'(\sigma);\ b \leftarrow E_1(\sigma, aux, com) : B(\sigma, com, dec) = b\,];$$

$$\mathrm{Prob}[\,\sigma \leftarrow \{0,1\}^*;\ (com, dec) \leftarrow A'(\sigma, b) : B(\sigma, com, dec) = b\,].$$



3 Sufficient Conditions for Nizk Proofs of Knowledge

In this section we present an additional sufficient condition for the construction 
of nizk proofs of knowledge for any NP relation. Formally, we obtain the following 

Theorem 1. Let $R$ be a polynomial time computable relation and let $L_R$ be
the language associated to $R$. If there exists a nizk proof system of membership
for $L_R$ and there exists an extractable commitment scheme then it is possible to
construct a nizk proof of knowledge for $R$.

In order to prove the above theorem, we start by showing that the following two
cryptographic primitives are equivalent: extractable commitment schemes and
dense public-key cryptosystems. More precisely, we show the following

Lemma 1. Let $\delta$ be a noticeable function. There exists (constructively) an
extractable commitment scheme if and only if there exists (constructively) a
$\delta$-dense public-key cryptosystem.

Theorem 1 then follows directly from Lemma 1 and the main result in [10],
saying that the existence of dense public-key cryptosystems and the existence
of nizk proofs of membership for all languages in NP implies the existence of
a nizk proof of knowledge for all polynomial time relations $R$. The proof of
Lemma 1 is described in Sections 3.1 and 3.2. We note that one could define a
notion of $\delta$-dense trapdoor permutations (by modifying the definition of
trapdoor permutations as one modifies the definition of public-key cryptosystems
to define their $\delta$-dense variant). As a direct corollary of the techniques in
this paper, and using the fact that the existence of trapdoor permutations is
sufficient for the existence of nizk proofs of membership for all languages in
NP [13], we would obtain that the existence of $\delta$-dense trapdoor permutations
(alone) is a sufficient condition for the existence of nizk proofs of knowledge
for all NP relations.
We now describe two corollaries about constructing extractable commitment 
schemes using both general complexity-theoretic conditions and also practical 
conditions based on the conjectured hardness of specific computational problems. 
In the first case we have a negative result and in the second some positive ones. 

Constructions under general conditions. We first observe that $\delta$-dense
public-key cryptosystems imply the existence of $(1 - \epsilon)$-dense public-key
cryptosystems, where $\epsilon$ is a negligible function (this follows by using
independent repetition of the $\delta$-dense cryptosystem, as in Yao's XOR theorem
^Hl). Then, using the result in m saying that constructing $(1 - \epsilon)$-dense
public-key cryptosystems based on one-way permutations only is as hard as proving
that $P \neq NP$, and our Lemma 1, we obtain the following

Corollary 1. Constructing extractable commitment schemes based on one-way
permutations only is as hard as proving that $P \neq NP$.

We note that the above result is especially interesting in light of the fact that, 
based on one-way functions only, it is known how to construct commitment 
schemes (using results from In other words, this shows that constructing 

extractable commitment schemes from any commitment scheme can be as hard
as proving that $P \neq NP$.

Constructions under practical conditions. Combining Lemma 1 with
results in nm, we obtain the following

Corollary 2. Let $\delta$ be a noticeable function. If breaking the RSA
cryptosystem, or the decisional Diffie-Hellman problem, or deciding quadratic
residuosity is hard, then it is possible to construct an extractable commitment
scheme.

3.1 From Extractable Commitments to Dense Cryptosystems

In this subsection we prove one direction of the main statement in Lemma 1.
Given an extractable commitment scheme $(A,B,(E_0,E_1))$, we construct a
$\delta$-dense public-key cryptosystem (KG,E,D), for some noticeable function
$\delta$.

Constructing the dense cryptosystem. The basic idea of the construction
of the dense public-key cryptosystem is to use the extractor both for the
key-generation and the decryption algorithms. Specifically, the key-generation
algorithm KG is obtained by running the first extractor algorithm $E_0$, and outputs
the reference string $\sigma$ returned by $E_0$ as a public key and the private
information $aux$ as a secret key. The encryption of message $m$ is obtained by
performing a commitment to $m$ using algorithm A and the public key output by KG
as a reference string. The decryption is obtained by running the second extractor
algorithm $E_1$ on input the ciphertext returned by the encryption algorithm, and
the public and secret keys returned by KG. A formal description follows.

The algorithm KG. On input a security parameter $1^n$ and a sufficiently long
random string $r$, algorithm KG sets $(\sigma, aux) = E_0(1^n, r)$, $pk = \sigma$,
$sk = aux$ and outputs: $(pk, sk)$.

The algorithm E. On input a public key $pk$ and a message $m$, algorithm E
sets $\sigma = pk$, $(com, dec) = A(\sigma, m)$, $ct = com$, and outputs: $ct$.

The algorithm D. On input public key $pk$, secret key $sk$ and ciphertext $ct$,
algorithm D sets $\sigma = pk$, $aux = sk$, $com = ct$,
$mes = E_1(\sigma, aux, com)$ and outputs: $mes$.
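
The three algorithms above are pure plumbing around the extractor, which a short
sketch can make concrete. The Python below is illustrative only: the
`ExtractableCommitment` container, the function names, and the toy scheme used to
exercise them are our assumptions, not part of the paper, and the toy scheme is
deliberately insecure (its "trapdoor" is the public reference string itself).

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ExtractableCommitment:
    """Hypothetical container for the algorithms A, E_0 and E_1."""
    commit: Callable[[bytes, int], Tuple[bytes, bytes]]  # A(sigma, b) -> (com, dec)
    setup: Callable[[int], Tuple[bytes, bytes]]          # E_0(1^n) -> (sigma, aux)
    extract: Callable[[bytes, bytes, bytes], int]        # E_1(sigma, aux, com) -> b


def keygen(scheme: ExtractableCommitment, n: int) -> Tuple[bytes, bytes]:
    """KG: run E_0; the reference string becomes pk and aux becomes sk."""
    sigma, aux = scheme.setup(n)
    return sigma, aux


def encrypt(scheme: ExtractableCommitment, pk: bytes, m: int) -> bytes:
    """E: commit to m under reference string pk; the commitment key is ct."""
    com, _dec = scheme.commit(pk, m)
    return com


def decrypt(scheme: ExtractableCommitment, pk: bytes, sk: bytes, ct: bytes) -> int:
    """D: run E_1 on the ciphertext, viewed as a commitment key."""
    return scheme.extract(pk, sk, ct)


# --- Toy stand-in, NOT secure: anyone holding sigma can open the commitment. ---
def _mask(sigma: bytes, r: bytes) -> int:
    return hashlib.sha256(sigma + r).digest()[0] & 1


def _toy_commit(sigma: bytes, b: int) -> Tuple[bytes, bytes]:
    r = secrets.token_bytes(len(sigma))
    return r + bytes([b ^ _mask(sigma, r)]), r


def _toy_setup(n: int) -> Tuple[bytes, bytes]:
    sigma = secrets.token_bytes(n)
    return sigma, sigma  # aux carries no real trapdoor in this toy


def _toy_extract(sigma: bytes, aux: bytes, com: bytes) -> int:
    return com[-1] ^ _mask(aux, com[:-1])


TOY = ExtractableCommitment(_toy_commit, _toy_setup, _toy_extract)
```

With any real extractable commitment plugged in, `decrypt(scheme, pk, sk,
encrypt(scheme, pk, b))` returns `b` whenever extraction succeeds; the density and
security of the resulting cryptosystem are what the proof of Lemma 1 has to
establish, not this sketch.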

3.2 From Dense Cryptosystems to Extractable Commitments 

In this subsection we prove the other direction of Lemma 1. Given a
$\delta$-dense public-key cryptosystem (KG,E,D), for some noticeable function
$\delta$, we construct an extractable commitment scheme $(A,B,(E_0,E_1))$.

Constructing the extractable commitment scheme. The basic idea of
the construction of the extractable commitment scheme is to use a portion of the
reference string as specifying several public keys for the encryption scheme, and
to commit to a bit $b$ by encrypting this bit according to each of these public
keys. The first extractor algorithm prepares the reference string by running the
key-generation algorithm several times, keeping the secret keys private and
setting the reference string equal to the sequence of public keys thus obtained.
The second extractor algorithm runs the decryption algorithm to decrypt
all encryptions that are in the commitment key; if all such decryptions return
the same bit, then the algorithm outputs this bit. Let $m = n/\delta(n)$.

The algorithm A. On input reference string $\sigma$ and bit $b$, algorithm A does
the following. First, A writes $\sigma$ as $\sigma = \sigma_1 \circ \cdots \circ \sigma_m$,
where $|\sigma_i| = n$, for $i = 1, \ldots, m$. Then, A sets $pk_i = \sigma_i$ and
computes $ct_i = E(pk_i, b)$ using a uniformly chosen string $r_i$ as a random
string for algorithm E, for $i = 1, \ldots, m$. Finally A sets
$com = (ct_1, \ldots, ct_m)$ and $dec = (r_1, \ldots, r_m)$, and outputs: $(com, dec)$.

The algorithm B. On input reference string $\sigma$, and strings $com, dec$,
algorithm B does the following. First B writes $\sigma$ as
$\sigma = pk_1 \circ \cdots \circ pk_m$, $com$ as $com = (com_1, \ldots, com_m)$,
and $dec$ as $dec = (dec_1, \ldots, dec_m)$. Then B verifies that there exists
a bit $b$ such that $com_i = E(pk_i, b)$ using $dec_i$ as a random string, for
$i = 1, \ldots, m$. If all these verifications are successful then B outputs: $b$,
else B outputs: $\perp$.

The algorithm $E_0$. On input $1^n$, algorithm $E_0$ does the following. $E_0$ lets
$(pk_i, sk_i) = KG(1^n)$, for $i = 1, \ldots, m$, using each time independently
chosen random bits. Finally $E_0$ sets $\sigma = pk_1 \circ \cdots \circ pk_m$ and
$aux = (sk_1, \ldots, sk_m)$, and outputs: $(\sigma, aux)$.

The algorithm $E_1$. On input reference string $\sigma$, and strings $com, aux$,
algorithm $E_1$ does the following. First algorithm $E_1$ writes $\sigma$ as
$\sigma = pk_1 \circ \cdots \circ pk_m$, $com$ as $com = (com_1, \ldots, com_m)$,
and $aux$ as $aux = (sk_1, \ldots, sk_m)$. Then $E_1$ checks if there exists a bit
$b$ such that $b = D(pk_i, sk_i, com_i)$, for $i = 1, \ldots, m$. If
so, then $E_1$ outputs: $b$; if not, then $E_1$ outputs: $\perp$.
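
As an illustration of how the four algorithms fit together, the following Python
sketch instantiates them over a toy cryptosystem. Everything named here
(`toy_keygen`, `toy_encrypt`, `toy_decrypt`, `split_keys`, `receive`, and so on) is
our own hypothetical naming; the toy cryptosystem is NOT secure (its secret key
equals its public key) and only stands in for (KG,E,D), and the reference string
is split into blocks of $n$ bytes rather than $n$ bits for convenience.

```python
import hashlib
import secrets
from typing import List, Optional, Tuple


# --- Toy stand-in for (KG, E, D). NOT secure: the secret key equals the public key. ---
def toy_keygen(n: int) -> Tuple[bytes, bytes]:
    pk = secrets.token_bytes(n)
    return pk, pk


def _pad(key: bytes, r: bytes) -> int:
    return hashlib.sha256(key + r).digest()[0] & 1


def toy_encrypt(pk: bytes, b: int, r: bytes) -> bytes:
    """Deterministic once the random string r is fixed, as algorithm B requires."""
    return r + bytes([b ^ _pad(pk, r)])


def toy_decrypt(pk: bytes, sk: bytes, ct: bytes) -> Optional[int]:
    if len(ct) != len(pk) + 1 or ct[-1] > 1:
        return None  # plays the role of the failure symbol
    return ct[-1] ^ _pad(sk, ct[:-1])


def split_keys(sigma: bytes, n: int) -> List[bytes]:
    """Read sigma as pk_1 o ... o pk_m with |pk_i| = n bytes."""
    return [sigma[i:i + n] for i in range(0, len(sigma), n)]


def commit(sigma: bytes, n: int, b: int) -> Tuple[List[bytes], List[bytes]]:
    """Algorithm A: encrypt b under every public key found in sigma."""
    pks = split_keys(sigma, n)
    dec = [secrets.token_bytes(n) for _ in pks]
    com = [toy_encrypt(pk, b, r) for pk, r in zip(pks, dec)]
    return com, dec


def receive(sigma: bytes, n: int, com: List[bytes], dec: List[bytes]) -> Optional[int]:
    """Algorithm B: accept b only if every com_i re-encrypts b under randomness dec_i."""
    pks = split_keys(sigma, n)
    if not pks or not (len(pks) == len(com) == len(dec)):
        return None
    for b in (0, 1):
        if all(c == toy_encrypt(pk, b, r) for pk, c, r in zip(pks, com, dec)):
            return b
    return None


def extractor_setup(n: int, m: int) -> Tuple[bytes, List[bytes]]:
    """Algorithm E_0: m independent key pairs; sigma is the sequence of public keys."""
    pairs = [toy_keygen(n) for _ in range(m)]
    return b"".join(pk for pk, _ in pairs), [sk for _, sk in pairs]


def extract(sigma: bytes, n: int, aux: List[bytes], com: List[bytes]) -> Optional[int]:
    """Algorithm E_1: decrypt every ciphertext; output the bit only if all agree."""
    pks = split_keys(sigma, n)
    if not pks or not (len(pks) == len(aux) == len(com)):
        return None
    bits = {toy_decrypt(pk, sk, c) for pk, sk, c in zip(pks, aux, com)}
    return bits.pop() if len(bits) == 1 else None
```

In this sketch `None` plays the role of $\perp$. A commitment whose components
encrypt different bits is rejected by `receive` and yields `None` from `extract`,
mirroring the "all decryptions return the same bit" condition above.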



4 A Practical Sufficient Condition: Hardness of Factoring 

In this section we consider the problem of constructing nizk proofs of knowledge 
for all NP relations based on practical conditions, namely, using the conjectured 
hardness of specific computational problems. We show that the hardness of fac- 
toring Blum integers (BF assumption) is sufficient for this task. 

Theorem 2. If the BF assumption holds then it is possible to construct a nizk 
proof system of knowledge for all polynomial-time relations. 

The theorem follows from a construction of $\delta$-secure extractable commitment
schemes from the BF assumption, shown in Section 4.1, an application of
Lemma 1, and the result in [El, which implies that the BF assumption is
sufficient for constructing a nizk proof of membership for all languages in NP. As
another application of this fact and Lemma 1 we obtain:

Corollary 3. If the BF assumption holds then it is possible to construct a
$\delta$-dense public-key cryptosystem, for some noticeable function $\delta$.

4.1 An Extractable Commitment Based on Factoring 

We now present a construction of an extractable commitment scheme based on
the hardness of factoring Blum integers (we remark that none of the previous
commitment schemes based on such assumptions is extractable).

Constructing the extractable commitment scheme. We informally describe
the basic ideas of our construction. First, a portion of the reference string
is used as an integer $x$ that is a Blum integer (note that by the results on the
distribution of primes, it follows that the probability that a uniformly chosen
$n$-bit integer satisfies this property is at least $\Omega(1/n^2)$ and therefore a
noticeable function). Then the squaring function is used in order to hide a value
$r$ uniformly chosen among the elements in $Z_x^*$ smaller than $x/2$ (note that
this mapping defines a one-to-one function). Then, if $b$ is the bit to be
committed, the committer chooses an $n$-bit random string $s$ such that
$r \odot s = c$ and determines $(s, r^2 \bmod x, c \oplus b)$ as a commitment key
and $r$ as a decommitment key (here, the $\odot$ operation is used to define a
hard-core bit, as from the result of m). Two levels of repetitions are required:
the first ensures that the probability that two square roots of the same integer
satisfy the $\odot$ equation for different values of $c$ is negligible; the second
ensures that at least one modulus from the public random string is a Blum integer.
Now we formally describe the scheme. We will denote by S the probabilistic
polynomial time algorithm that, on input $1^n$, outputs an $n$-bit integer $x$,
together with its factorization $fact$.
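
The second level of repetition can be checked with a short calculation. As a
sketch, assume (as in the remark above) that a uniformly chosen $n$-bit integer is
a Blum integer with probability at least $\kappa/n^2$ for some constant
$\kappa > 0$ (the name $\kappa$ is ours), and that the reference string supplies
$m = n^3$ independent uniform moduli $x_1, \ldots, x_m$, as in the scheme of this
section. Then, using $1 - t \le e^{-t}$,

$$\Pr[\text{no } x_j \text{ is a Blum integer}] \le \left(1 - \frac{\kappa}{n^2}\right)^{n^3} \le e^{-\kappa n},$$

which is negligible in $n$.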

The algorithm A: On input the reference string $\sigma$ and a bit $b$, algorithm A
does the following. A writes $\sigma$ as $\sigma = \sigma_1 \circ \cdots \circ \sigma_m$,
for $m = n^3$, and repeats the following for each $j = 1, \ldots, m$ (using
independently chosen random bits). A writes $\sigma_j$ as
$\sigma_j = x_j \circ s_{1,j} \circ \cdots \circ s_{2n,j}$, where
$x_j, s_{1,j}, \ldots, s_{2n,j} \in \{0,1\}^n$. Then A randomly chooses
$r_{1,j}, \ldots, r_{2n,j} \in Z_{x_j}^*$ such that $r_{1,j}, \ldots, r_{2n,j} < x_j/2$.
Now, for $i = 1, \ldots, 2n$, A computes $c_{i,j} = s_{i,j} \odot r_{i,j}$, if $b = 0$,
and $c_{i,j} = 1 - (s_{i,j} \odot r_{i,j})$, if $b = 1$. Finally A computes
$z_{i,j} = r_{i,j}^2 \bmod x_j$, for $i = 1, \ldots, 2n$, sets
$com_j = ((z_{1,j}, s_{1,j}, c_{1,j}), \ldots, (z_{2n,j}, s_{2n,j}, c_{2n,j}))$ and
$dec_j = (r_{1,j}, \ldots, r_{2n,j})$, $com = (com_1, \ldots, com_{n^3})$ and
$dec = (dec_1, \ldots, dec_{n^3})$ and outputs: $(com, dec)$.

The algorithm B: On input the reference string $\sigma$, and two strings $com, dec$,
algorithm B does the following. B writes $\sigma$ as
$\sigma = \sigma_1 \circ \cdots \circ \sigma_m$, for $m = n^3$, and repeats the
following for each $j = 1, \ldots, m$ (using independently chosen random bits).
B writes each $\sigma_j$ as $\sigma_j = x_j \circ s_{1,j} \circ \cdots \circ s_{2n,j}$,
where $x_j, s_{1,j}, \ldots, s_{2n,j} \in \{0,1\}^n$. Then B checks that $com$ can be
written as $com = (com_1, \ldots, com_{n^3})$, and $dec$ can be written as
$dec = (dec_1, \ldots, dec_{n^3})$, where
$com_j = ((z_{1,j}, s_{1,j}, c_{1,j}), \ldots, (z_{2n,j}, s_{2n,j}, c_{2n,j}))$,
$dec_j = (r_{1,j}, \ldots, r_{2n,j})$, $r_{i,j} \in Z_{x_j}^*$, $r_{i,j} < x_j/2$,
and $z_{i,j} = r_{i,j}^2 \bmod x_j$, for $j = 1, \ldots, n^3$ and $i = 1, \ldots, 2n$.
If any of the above verifications is not satisfied, then B outputs: $\perp$ and
halts. Otherwise, for each $j = 1, \ldots, n^3$, B checks that either
$s_{i,j} \odot r_{i,j} = c_{i,j}$, for $i = 1, \ldots, 2n$, in which case B sets
$b_j = 0$, or $s_{i,j} \odot r_{i,j} = 1 - c_{i,j}$, for $i = 1, \ldots, 2n$, in which
case B sets $b_j = 1$. If $b_1 = \cdots = b_{n^3}$ then B sets $b = b_1$, else B sets
$b = \perp$. Finally, B outputs: $b$.

The algorithm $E_0$: On input $1^n$, algorithm $E_0$ does the following. $E_0$ runs
$n^3$ times algorithm S on input $1^n$ (each time using independently chosen random
bits), and lets $((x_1, fact_1), \ldots, (x_{n^3}, fact_{n^3}))$ be its output. Then
$E_0$ sets $\sigma = x_1 \circ \cdots \circ x_{n^3}$, sets
$aux = (fact_1, \ldots, fact_{n^3})$, and outputs: $(\sigma, aux)$.

The algorithm $E_1$: On input $\sigma, com, aux$, algorithm $E_1$ does the following.
$E_1$ writes $\sigma$ as $\sigma_1, \ldots, \sigma_{n^3}$, and each $\sigma_j$ as
$\sigma_j = x_j \circ s_{1,j} \circ \cdots \circ s_{2n,j}$, where $x_j$ is an $n$-bit
integer, and $com$ as $com = com_1 \circ \cdots \circ com_{n^3}$, where
$com_j = ((z_{1,j}, s_{1,j}, c_{1,j}), \ldots, (z_{2n,j}, s_{2n,j}, c_{2n,j}))$, for
$z_{i,j} \in Z_{x_j}^*$, $s_{i,j} \in \{0,1\}^n$ and $c_{i,j} \in \{0,1\}$,
for $i = 1, \ldots, 2n$ and $j = 1, \ldots, n^3$. $E_1$ checks that $com$ can be
written as above and that $aux_j$ is the factorization of $x_j$, for
$j = 1, \ldots, n^3$. If these verifications are not satisfied, $E_1$ outputs:
$\perp$ and halts. Otherwise, $E_1$ uses $aux$ to compute
$r_{1,j}, \ldots, r_{2n,j} \in Z_{x_j}^*$ such that $r_{i,j} < x_j/2$,
$r_{i,j}^2 = z_{i,j} \bmod x_j$, and it holds that
$r_{i,j} \odot s_{i,j} = c_{i,j} \oplus b'$, for $i = 1, \ldots, 2n$,
$j = 1, \ldots, n^3$ and $b' \in \{0,1\}$. $E_1$ sets $b = b'$
if such a $b'$ exists, or $b = \perp$ otherwise, and outputs: $b$.
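
The step in which $E_1$ "uses $aux$ to compute" square roots can be sketched for a
single modulus as follows. This is a minimal illustration under our own
assumptions, not the paper's code: all function names are hypothetical, $\odot$ is
read as the inner product modulo 2 of the binary expansions, the factorization is
passed as two primes $p, q$ congruent to 3 modulo 4, and $r$ is supplied by the
caller (instead of being drawn at random) so that the behaviour is reproducible.
The sketch returns every root below $x/2$; when more than one exists, it is the
repetition over several triples that singles out the committed bit.

```python
from math import gcd
from typing import List, Optional, Set, Tuple

Triple = Tuple[int, int, int]  # one (z, s, c) component of a commitment key


def inner_product_bit(r: int, s: int) -> int:
    """Inner product mod 2 of the binary expansions of r and s (assumed meaning of the dot-circle)."""
    return bin(r & s).count("1") & 1


def commit_triple(x: int, r: int, s: int, b: int) -> Triple:
    """One triple of algorithm A: z = r^2 mod x and c = (s . r) xor b."""
    if not (0 < 2 * r < x and gcd(r, x) == 1):
        raise ValueError("r must lie in Z_x^* and be smaller than x/2")
    return (r * r % x, s, inner_product_bit(r, s) ^ b)


def small_square_roots(z: int, p: int, q: int) -> List[int]:
    """All square roots of z modulo x = p*q that are below x/2, for p, q = 3 mod 4."""
    x = p * q
    rp, rq = pow(z, (p + 1) // 4, p), pow(z, (q + 1) // 4, q)
    if rp * rp % p != z % p or rq * rq % q != z % q:
        return []  # z is not a square modulo x
    inv_p = pow(p, -1, q)  # modular inverse; needs Python 3.8+
    roots = set()
    for sp in (rp, p - rp):
        for sq in (rq, q - rq):
            # Chinese remaindering: the value that is sp mod p and sq mod q.
            roots.add((sp + p * ((sq - sp) * inv_p % q)) % x)
    return sorted(r for r in roots if 0 < 2 * r < x)


def consistent_bits(p: int, q: int, triples: List[Triple]) -> Set[int]:
    """Bits b' such that every triple has a small root r with r . s = c xor b'."""
    found = set()
    for b in (0, 1):
        if all(
            any(inner_product_bit(r, s) == c ^ b for r in small_square_roots(z, p, q))
            for z, s, c in triples
        ):
            found.add(b)
    return found


def extract_bit(p: int, q: int, triples: List[Triple]) -> Optional[int]:
    """E_1 restricted to one modulus: the bit if it is unique, else None."""
    bits = consistent_bits(p, q, triples)
    return bits.pop() if len(bits) == 1 else None
```

For the toy modulus $x = 77 = 7 \cdot 11$, the square 25 has the two roots 5 and 16
below $x/2$. A single triple with $s = 1$ is then consistent with both bits, while
adding a second triple with $s = 2$ leaves only the committed bit.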

5 Necessary and Sufficient Conditions 

In this section we consider the problem of providing necessary conditions for
the construction of nizk proofs of knowledge for any NP relation. We show that
both sufficient conditions considered in Section 3 are also necessary, and
therefore equivalent. Formally, we obtain the following

Theorem 3. Let $\delta$ be a noticeable function, let $R$ be a polynomial time
computable relation, let $L_R$ be the language associated to $R$, and assume that
$L_R \notin BPP$. The following three conditions are equivalent:

1. the existence of a nizk proof of knowledge for $R$;

2. the existence of a $\delta$-dense public-key cryptosystem and the existence of a
nizk proof system of membership for $L_R$;

3. the existence of an extractable commitment scheme and the existence of a
nizk proof system of membership for $L_R$.

First of all we remark that the assumption made in the above theorem that
$L_R \notin BPP$ is without loss of generality. Indeed, if $L_R$ is in BPP, then a
trivial and unconditional construction for a nizk proof of knowledge for $R$
exists (namely, the prover sends no proof to the verifier, who can compute by
himself a $y$ such that $R(x, y)$ holds, for any input $x$). Therefore, we continue
our discussion by assuming that $L_R \notin BPP$. In order to prove the above
theorem, it is enough to show that one of the two conditions shown to be
sufficient in Section 3 is indeed necessary as well. We do this in the rest of the
section with extractable commitment schemes. We divide the presentation in two
steps. Since the existence of nizk proofs of knowledge for a relation $R$ implies,
by definition, the existence of a nizk proof of membership for the associated
language $L_R$, it is enough to prove that 1) the existence of nizk proofs of
knowledge for a relation $R$ associated to a language not in BPP implies the
existence of commitment schemes, and 2) the existence of commitment schemes and
of nizk proofs of knowledge for all polynomial time relations implies the
existence of extractable commitment schemes. Clearly, Theorem 3 follows by
combining these statements.

Nizk proofs of knowledge imply commitment schemes. We first discuss 
the possibility of proving that the existence of a nizk proof of knowledge for 
relation R implies the existence of a one-way function. Notice that we cannot 
use the result of |2B| since it would give such a result by making the additional 
assumption that the language $L_R$ is hard on average; namely, it is not in
average-BPP (which is likely to be a class larger than BPP). Instead, we give the
following direct construction. Let (P,V) be the nizk proof of knowledge for $R$ and
let $(E_0, E_1)$ be the extractor associated to (P,V). Moreover, let $m(n)$ be an
upper bound on the number of random bits that $E_0$ uses when taking as input
security parameter $1^n$. Since $m(n)$ can be chosen as a polynomial, there exists a
function $l : \mathcal{N} \to \mathcal{N}$ such that $m(l(n)) = n$ for all
$n \in \mathcal{N}$, and $l$ is itself bounded by a polynomial. Then, we define the
function ensemble $f = \{f_n\}_{n \in \mathcal{N}}$, where for all $x \in \{0,1\}^n$,
it holds that $f_n(x) = \sigma$, such that $(\sigma, aux) = E_0(1^{l(n)}, x)$. We can
prove that if $L_R$ is not in BPP then $f$ is a one-way function (intuitively, observe
that if this is not the case, then V can invert $f$ and use the proof received from
P in order to obtain $w$ with some not negligible probability, thus contradicting
the zero-knowledge property of (P,V)). A construction of a commitment scheme 
in the public random string model starting from any one-way function can be 
obtained by using the results in |1 and observing that the commitment 
scheme in 1231 can be modified so to require only one move in the public random 
string setting. 

Commitment schemes and nizk proofs of knowledge imply extractable
commitment schemes. We assume the existence of a commitment scheme
(C,D) in the public random string model and the existence of a nizk proof of
knowledge (P,V) for any polynomial time relation $R$, with an associated
extractor $(Ext_0, Ext_1)$. We would like to construct an extractable commitment
scheme $(A,B,(E_0,E_1))$. The basic idea of our construction is to have algorithm A
commit to a bit $b$ by using algorithm C and then use algorithm P to prove the
knowledge of a valid decommitment key for the commitment key output by C.

The algorithm A: On input the reference string $\sigma$ and a bit $b$, algorithm A
does the following. A writes $\sigma$ as $\sigma = \sigma_1 \circ \sigma_2$, sets
$(com_1, dec_1) = C(\sigma_1, b)$ and defines relation
$R_2 = R_2(\sigma, b) = \{(c, d) \mid D(\sigma, c, d) = b\}$. Then A runs algorithm
P using $com_1$ as public input, $dec_1$ as private input, and $\sigma_2$ as a
reference string in order to compute a proof of knowledge $proof$ of $dec_1$ such
that $(com_1, dec_1)$ belongs to relation $R_2$. Finally A sets
$com = (com_1, proof)$ and $dec = dec_1$ and outputs: $(com, dec)$.

The algorithm B: On input the reference string $\sigma$, and two strings $com, dec$,
algorithm B does the following. B writes $\sigma$ as $\sigma = \sigma_1 \circ \sigma_2$
and $com$ as $com = (com_1, proof)$, and defines relation
$R_2 = R_2(\sigma, b) = \{(c, d) \mid D(\sigma, c, d) = b\}$.
Then B verifies that $D(\sigma_1, com_1, dec) \neq \perp$, and
$V(\sigma_2, com_1, proof) = 1$. If any of the above verifications is not satisfied
B outputs: $\perp$, else B computes $b = D(\sigma_1, com_1, dec)$ and outputs: $b$.

The algorithm $E_0$: On input $1^n$, algorithm $E_0$ uniformly chooses $\sigma_1$,
computes $(\sigma_2, aux) = Ext_0(1^n)$ and sets $\sigma = \sigma_1 \circ \sigma_2$.
Finally $E_0$ outputs: $(\sigma, aux)$.

The algorithm $E_1$: On input $\sigma, com, aux$, algorithm $E_1$ writes $\sigma$ as
$\sigma_1 \circ \sigma_2$ and $com$ as $com = (com_1, proof)$. Then $E_1$ sets
$dec_1 = Ext_1(\sigma_2, com_1, proof, aux)$ and computes
$b = D(\sigma_1, com_1, dec_1)$. Finally $E_1$ outputs: $b$.

Abstract. Under a computational assumption, and assuming that both
Prover and Verifier are computationally bounded, we show a one-round
(i.e., Verifier speaks and then Prover answers) witness-indistinguishable
interactive proof for NP with poly-logarithmic communication complexity.
A major application of our main result is that we show how to check
in an efficient manner and without any additional interaction the
correctness of the output of any remote procedure call.



1 Introduction 

Under a computational assumption, and assuming that both Prover and Verifier 
are computationally bounded, we show a one-round (i.e., Verifier speaks and 
then Prover answers) witness-indistinguishable interactive proof for NP with 
poly-logarithmic communication complexity. That is, the properties of our one- 
round interactive proof (argument) are as follows: 

– Perfect Completeness: On a common input $x \in L$ for any $L$ in NP, if Prover
is given a witness as a private input, it convinces Verifier (i.e., Verifier
accepts) with probability one.

– Computational Soundness: If $x \notin L$, no poly-time cheating Prover can make
the Verifier accept except with negligible probability.

– Protocol is Short: the communication complexity (of both messages) is
poly-logarithmic in the length of the proof.

– Witness-indistinguishability: The interactive proof is witness
indistinguishable: if there is more than one witness that $x \in L$ then for any
two witnesses no poly-time cheating Verifier can distinguish which was given to
the Prover as a private input.

Our result improves upon the best previous results on short computationally-sound
proofs (arguments) of Micali ^O] and of Kilian mm, which required either three
messages of interaction or the assumption of the existence of a random oracle.
Our result also improves the communication complexity of one-round
witness-indistinguishable arguments ("Zaps") of Dwork and Naor [21, which required
polynomial communication complexity in the length of the proof. One important
difference between our protocol and that of Dwork and Naor is that ours is a
private-coin protocol^1, whereas the Dwork and Naor protocol is public-coin.

^1 A private-coin protocol is one where the Verifier tosses coins but keeps them
secret from the Prover, whereas public-coin protocols are ones where the Verifier
can just publish its coin flips.

A major corollary of our result is that we show how to check in an efficient
manner and without any additional interaction the correctness of the output of 
any remote procedure call. That is, if Alice requests Bob to execute any remote 
code and give back the result, Alice can also provide to Bob (together with the 
code) a very short public- key and receive back from Bob not only the result 
of the execution but also a very short and easily^ verifiable certificate that the 
output was computed correctly. We stress that our scheme for verifiable remote 
procedure call is round-optimal: there is only one message from Alice to Bob 
(i.e. the procedure call) and only one message back from Bob to Alice (with 
the output of the procedure call and the certificate of correctness). Our scheme 
guarantees that if Bob follows the protocol then Alice always accepts the output, 
but if the claimed output is incorrect Bob cannot construct a certificate which
Alice will accept, except with negligible probability. The public key and the
certificate are only of poly-logarithmic size and can be constructed by Bob in
polynomial time and verified by Alice in poly-logarithmic time. This resolves
the major open problem posed by Micali in the theory of short
computationally-sound proofs l21)l (i.e., that they exist with only one round of
interaction under a complexity assumption). This also improves the works of
j21)ll6ll5<2^ and also of Ergün, Kumar and Rubinfeld P3, who consider this problem
for a restricted case of languages in NP with additional algebraic properties.

It is interesting to point out that the proof of our main result also shows
that the heuristic approach of Biehl, Meyer and Wetzel of combining
Probabilistically Checkable Proofs of Arora et al. with Single-Database Private
Information Retrieval of Kushilevitz and Ostrovsky and of Cachin, Micali
and Stadler is valid, though with additional steps both in the protocol and
in its proof of security. More specifically, our main result uses a novel
combinatorial technique to prove computational soundness of our interactive
proof, and we believe this proof is of independent interest.

Our result also has an interesting connection to the PCP theorem. Recall
that in the PCP setting the Prover must send the Verifier an entire proof.
Subsequently, the Verifier needs to read only a few bits of this proof at random
in order to check its validity. Notice, however, that the communication
complexity is large (since the Prover must send the entire PCP proof). Moreover,
notice that even though the Verifier is interested in reading only a few bits,
the Verifier must count the received bits in order to decide which bits to read.
In our setting, the Verifier asks his question first and then the Prover can make
his answers dependent on the Verifier's questions. Nevertheless, we show that
even if a computationally-bounded Prover gets to see the Verifier's message
before it generates its answer, it cannot cheat the Verifier, even if its answer
depends on the question. Moreover, the length of the proof is dramatically
shorter than the PCP proof.

Our main result has several implications, including a different method of
reducing the error in one-round two-prover interactive proofs with small
communication complexity. Our proof also has implications for the generalized
PIR protocols, where the user asks for many bits and wishes to know if there
exists some database which is consistent with all the answers. We discuss this
and other implications after we prove our main result.



^ By very short and easily verifiable we mean that the size of the public-key/certificate
as well as the generation/verification time is a product of two quantities: a fixed
polynomial in the security parameter times a term which is poly-logarithmic in the
running time of the actual remote procedure call.
2 Preliminaries

First, we recall some of the terminology and tools that we use in our construction.
For definitions and further discussion see the provided references. We use standard
notions of expected polynomial-time computations, polynomial
indistinguishability, negligible fractions and hybrid arguments (for example, see
Yao [23]) and the notion of witness-indistinguishability.

We shall use Single Database Private Information Retrieval (PIR) protocols,
first put forward by Kushilevitz and Ostrovsky and further improved (to an
essentially optimal communication complexity) by Cachin, Micali and Stadler.
(In the setting of multiple non-communicating copies of the database, PIR
was originally defined by Chor et al.) Recall that Single Database PIR protocols
are private-coin protocols between a user and a database, where the database has
an $n$-bit string $D$ and the user has an index $1 \le i \le n$. Under some
computational (hardness) assumption, the user forms a PIR question to the database
based on $i$ and a security parameter^ $k$, which the database answers. The answer
can be “decrypted” by the user to correctly learn the $i$'th bit of $D$. The key
property is that $i$ remains hidden from the database. That is, for any two indexes
$i$ and $j$ the PIR questions of $i$ and $j$ are computationally indistinguishable
(i.e., cannot be distinguished except with negligible probability, denoted
$\epsilon_{PIR}$). By $PIR(cc, u, db)$ we denote any one-round single database PIR
protocol with communication complexity $cc(n, k)$, running time of the user (to
both generate the question and decode the answer from the database) $u(n, k)$, and
running time of the database $db(n, k)$ (where $db(n, k)$ must be polynomial
in $n$ and $k$).
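To make the PIR interface above concrete, here is a minimal Python sketch of the basic quadratic-residuosity scheme in the style of Kushilevitz and Ostrovsky. Everything in it is an illustrative assumption rather than the construction used in this paper: the function names (`keygen`, `pir_query`, `pir_answer`, `pir_decode`) are hypothetical, the primes are far too small to be secure, and the database is laid out as a matrix, so the communication is on the order of the matrix width plus height rather than poly-logarithmic.

```python
import math
import random

def is_qr_mod_prime(a, p):
    """Euler's criterion: True iff a is a nonzero quadratic residue mod the odd prime p."""
    return pow(a, (p - 1) // 2, p) == 1

def keygen(p=1019, q=1031):
    """Toy key: the modulus p*q is public, the factorisation is the user's secret."""
    return p * q, (p, q)

def sample(n_mod, secret, residue, rng):
    """Sample a unit mod n_mod: a square if residue=True, else a non-residue mod both primes."""
    p, q = secret
    while True:
        y = rng.randrange(2, n_mod)
        if math.gcd(y, n_mod) != 1:
            continue
        if residue:
            return y * y % n_mod
        if not is_qr_mod_prime(y % p, p) and not is_qr_mod_prime(y % q, q):
            return y

def pir_query(col, width, n_mod, secret, rng):
    """One group element per column; only the wanted column carries a non-residue."""
    return [sample(n_mod, secret, residue=(j != col), rng=rng) for j in range(width)]

def pir_answer(matrix, query, n_mod):
    """Database side: per row, multiply y_j (bit 1) or y_j^2 (bit 0) over all columns."""
    answers = []
    for row in matrix:
        z = 1
        for bit, y in zip(row, query):
            z = z * (y if bit else y * y) % n_mod
        answers.append(z)
    return answers

def pir_decode(answers, row, secret):
    """User side: the row's product is a residue exactly when the wanted bit is 0."""
    p, _ = secret
    return 0 if is_qr_mod_prime(answers[row] % p, p) else 1
```

The user keeps `secret`, sends the list returned by `pir_query`, and decodes the entry of `pir_answer` belonging to the row of interest; the database only ever sees numbers with Jacobi symbol +1.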

We remark that by standard hybrid argument, poly-size vectors of PIR 
queries are also indistinguishable. We also point our that both HH| and jSj 
are one-round protocols, unlike P2|. Furthermore, we will also use Secure-PIR 
(SPIR) protocols in a single database setting (see Gertner et al. and
Kushilevitz and Ostrovsky) and the one-round SPIR protocol (based on any 1-round
PIR protocol and one-round OT) of Naor and Pinkas. The latter result
shows how to convert any PIR protocol to a SPIR protocol using one call to
PIR and an additive polylogarithmic overhead in communication.

We will use Probabilistically Checkable Proofs (PCP) of Arora et al.
and the 3-query PCP proofs of Håstad, as well as holographic PCP proofs
of Babai et al. and of Polishchuk and Spielman. Recall that a language in
NP has an $(\epsilon_{PCP}, k)$ PCP proof if two conditions hold: For every string
$x$ in the language there is a polynomially long “proof” (which, given a witness,
can be constructed in polynomial time) which the probabilistic polynomial time
Verifier $V$ will always accept. For every string not in the language there is no
proof that such a $V$ will accept with probability greater than $\epsilon_{PCP}$.
Furthermore, $V$ reads at most $k$ bits. Additionally, we shall need the
Zero-Knowledge PCP of Kilian et al. [17].



3 Problem Statements and Our Results 

We state our problem in two settings, the setting of proving an NP statement 
and the setting of efficient verification of remote procedure calls. 

^ By $k = k(\cdot)$ we denote a sufficiently large security parameter. That is,
$k(\cdot)$ is a function of $n$ so that for sufficiently large $n$, the error
probability is negligible. In PIR implementations, depending on a particular
assumption, $k(n)$ is typically assumed to be either $\log^{O(1)}(n)$ or
$n^{\epsilon}$ for $\epsilon > 0$.
In the setting of proving an NP statement, both prover and verifier are
probabilistic poly-time machines. They both receive as a common input (i.e., on
their work-tapes) an input $x$ and a security parameter $k$. Here, since $x$ is
given on the work-tape of the verifier, we do not charge the length of $x$ as part
of the communication complexity, nor do we charge the time to read $x$, as it is
given before the protocol begins. Additionally, the Prover receives (if $x \in L$)
as a private input a witness $w$ that $x \in L$. W.l.o.g. we assume that
$|w| \ge |x|$ (we stress here that the interesting case, and the one that we also
handle, is the one where $|w| \gg |x|$).

THEOREM 1 Assume that a one-round PIR(cc,db,u) scheme exists. Then, there
exists a (P,V) one-round computationally-sound proof with perfect completeness and
computational soundness, so that: the communication complexity is
$O(\log^{O(1)} |w| \cdot cc(|w|^{O(1)}, k))$; the prover running time is
$O(|w|^{O(1)} + db(|w|^{O(1)}, k))$; and the verifier running time is
$O(\log^{O(1)} |w| \cdot u(|w|^{O(1)}, k))$.

If we use the Cachin et al. implementation of PIR based on the $\Phi$-hiding
assumption, and $k(x) = \log^{O(1)} |x|$ for sufficiently large $x$, then we can
achieve both poly-logarithmic communication complexity and poly-logarithmic
user's computation:

COROLLARY 1 Assume that the $\Phi$-hiding assumption holds. Then, for any $\epsilon > 0$
there exists a (P,V) one-round computationally-sound proof with perfect
completeness and computational soundness, so that for $x$ sufficiently large, the
total communication complexity is $O(\log^{O(1)} |w|)$; the prover computation is
$|w|^{O(1)}$; and the verifier's computation is $O(\log^{O(1)} |w|)$.

If we use the Kushilevitz and Ostrovsky implementation of PIR based on the
quadratic residuosity assumption, and $k(x) = |x|^{\epsilon}$ for $\epsilon > 0$ and
$x$ sufficiently large, then we get the following:

COROLLARY 2 Assume that the quadratic residuosity assumption holds. Then,
there exists a (P,V) one-round computationally-sound proof with perfect completeness
and computational soundness, so that for $x$ sufficiently large, the total
communication complexity is $|w|^{\epsilon}$; the prover computation is $|w|^{O(1)}$;
and the verifier's computation is $|w|^{\epsilon}$, for any $\epsilon > 0$.

We remark that using holographic proofs we can make the running
time of the prover in the above theorem and in both corollaries nearly-linear.

We also show how we can add the witness-indistinguishability property to our
low-communication complexity protocol by combining Zero-Knowledge PCP
with one-round SPIR protocols. That is, we modify our results above
to have the witness-indistinguishability property:

THEOREM 2 Assume that a one-round PIR(cc,db,u) scheme exists. Then, there
exists a (P,V) one-round witness-indistinguishable computationally-sound proof with
perfect completeness and computational soundness, so that: the communication
complexity is $O(\log^{O(1)} |w| \cdot cc(|w|^{O(1)}, k))$; the prover running time
is $O(|w|^{O(1)} + db(|w|^{O(1)}, k))$; and the verifier running time is
$O(\log^{O(1)} |w| \cdot u(|w|^{O(1)}, k))$.



Our second setting is that of verification of any remote procedure call. Here,
Alice has a remote procedure call $\Pi$ (an arbitrary code with an arbitrary input)
and a polynomial bound $t$ on the running time. Alice generates a public key and
a private key, keeps the private key to herself and sends the public key, $\Pi$ and
$t$ to Bob. Bob executes $y \leftarrow \Pi$ for at most $t$ steps ($y$ is defined as
$\bot$ if the program $\Pi$ did not terminate in $t$ steps). Bob, using the public
key, also computes a certificate $c$ that $y$ is correct and sends $c$ and $y$ back
to Alice. Alice, using her public key and private key, decides to either accept or
reject $y$. We say that the remote procedure call verification is correct if two
conditions hold: if Bob follows the protocol then Alice always accepts; on the other
hand, no poly-time cheating Bob' can find a certificate for an incorrect $y'$ on
which Alice will accept, except with negligible probability. Clearly, Alice must
take the time to send $\Pi$ to Bob and to read $y$. W.l.o.g. we assume
$|y| = |\Pi| \le t$. However, if $|y| \ll t$, the question of fast verification of
$y$ becomes important. We show that Alice can do so in an efficient manner:

THEOREM 3 Assume that a one-round PIR(cc,db,u) scheme exists. Then, there
exists a correct verification of any remote procedure call $(\Pi, t)$ such that the
running time of Bob is $O(t^{O(1)} + db(t^{O(1)}, k))$; the sizes of the public key,
private key and the certificate are $O(\log^{O(1)} t \cdot cc(t^{O(1)}, k))$; and the
running time of Alice is $O(|y| + \log^{O(1)} t \cdot u(t^{O(1)}, k))$.

We stress that Theorem 3 holds for any one-round implementation of a PIR
protocol. For example, if we use the Cachin et al. implementation of PIR, based
on the $\Phi$-hiding assumption, and $k(t) = \log^{O(1)} t$ for sufficiently large
$t$, then we achieve poly-logarithmic bounds for the verification of $y$. We again
remark that using holographic proofs we can make the running time of the prover in
the above theorem nearly-linear. Combining, we achieve:

COROLLARY 3 Assume that the $\Phi$-hiding assumption holds. Then, for any $\epsilon > 0$,
there exists a correct verification of any remote procedure call $(\Pi, t)$ such that for
t sufficiently large, the running time of Bob is the size of public key, 

private key and the certificate are all $O(\log^{O(1)} t)$, and the running time of Alice is
$O(|y| + \log^{O(1)} t)$.

We remark that if $\Pi$ is a probabilistic computation, the above theorems
also hold. Moreover, the coin-flips of Bob can be made witness-indistinguishable,
similar to Theorem 2.

4 Constructions 

First, we describe our construction for Theorem 1. Recall that the Prover is
given $x$ and a witness $w$. Let $PCP(x,w)$, with $|PCP(x,w)| = N$, be Håstad's
3-query PCP proof that $x \in L$. We remark that the use of Håstad's version of
PCP is not essential in our construction, and we can replace it with any
holographic proof that achieves negligible error probability, and thus we can
improve the running time of the prover to be nearly-linear. We choose to use
Håstad's version of PCP since it somewhat simplifies the presentation of our
main theorem. However, we stress that permuting PCP queries is essential in
our proof.

In the construction of Theorem 3, simply note that the Prover can write down
the trace of the execution of the procedure call, which serves as a witness that
the output is correct, and the same construction applies. The corollaries
follow from our construction simply by plugging in the appropriate
implementation of PIR protocols.
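The remark that an execution trace serves as a witness can be illustrated with a small Python sketch. The machine below (a counter that sums 1..n) and the names `step`, `halted`, `run_with_trace` and `check_trace` are hypothetical stand-ins chosen only for illustration; the point is that the claimed output is correct exactly when every consecutive pair of states in the trace is a valid step, which is the kind of locally checkable statement an NP witness provides.

```python
from typing import List, Optional, Tuple

State = Tuple[int, int, int]  # (counter, running_total, limit) of a toy machine

def step(state: State) -> State:
    """One transition of the toy machine: add the counter to the total."""
    i, total, n = state
    return (i + 1, total + i, n)

def halted(state: State) -> bool:
    i, _, n = state
    return i > n

def run_with_trace(n: int, t: int) -> Tuple[Optional[int], List[State]]:
    """Bob's side: run for at most t steps and record every state.
    The output is None (standing in for the bottom symbol) if the run did not halt."""
    state = (1, 0, n)
    trace = [state]
    for _ in range(t):
        if halted(state):
            break
        state = step(state)
        trace.append(state)
    return (state[1] if halted(state) else None), trace

def check_trace(n: int, t: int, y: Optional[int], trace: List[State]) -> bool:
    """Direct check of the witness: correct start, valid steps, matching output."""
    if not trace or trace[0] != (1, 0, n) or len(trace) > t + 1:
        return False
    if any(halted(a) or step(a) != b for a, b in zip(trace, trace[1:])):
        return False
    last = trace[-1]
    if y is None:
        return not halted(last) and len(trace) == t + 1
    return halted(last) and last[1] == y
```

In the paper's setting Alice never reads the whole trace as `check_trace` does; the trace is only the witness from which the Prover builds the PCP proof that is then queried through PIR.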
V: For $j = 1, \ldots, \log^2 N$ do:

- Choose $(i_1, i_2, i_3)_j \in [1, N]^3$ according to Håstad's PCP Verifier.

- Choose a random permutation $\sigma_j$ over $\{i_1, i_2, i_3\}$.
Let $(i'_1, i'_2, i'_3)_j = \sigma_j(i_1, i_2, i_3)$.

- Compute 3 independent PIR encodings $PIR_j(i'_1)$, $PIR_j(i'_2)$,
$PIR_j(i'_3)$ (i.e. 3 queries for retrieval from an $N$-bit database,
each having its own independent encoding and decoding keys).

V -> P: For $1 \le j \le \log^2 N$ send $PIR_j(i'_1)$, $PIR_j(i'_2)$, $PIR_j(i'_3)$.

P -> V: Prover computes Håstad's PCP proof on $(x,w)$, treats it as an $N$-bit
database, evaluates the $3 \log^2 N$ PIR queries received from the verifier and
sends the PIR answers back to the Verifier.

V: For $1 \le j \le \log^2 N$, PIR-decode the answers to $PIR_j(i'_1)$,
$PIR_j(i'_2)$, $PIR_j(i'_3)$ and apply $\sigma_j^{-1}$ to get the answers to
$(i_1, i_2, i_3)_j$. If for all $1 \le j \le \log^2 N$, Håstad's PCP verifier
accepts on the answers to PCP queries $(i_1, i_2, i_3)_j$, then accept, else reject.
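The message flow of this protocol can be sketched in Python as follows. This is a simplified illustration under loud assumptions: the PIR layer is replaced by a plain lookup (so nothing is hidden from the prover), the acceptance predicate is supplied by the caller as a stand-in for Håstad's verifier, and all function names are hypothetical. What the sketch does show faithfully is the bookkeeping of the random permutation: the verifier sends the three positions in shuffled order and applies the inverse permutation to the answers before checking.

```python
import random

def verifier_query(n, rng):
    """Pick three distinct positions and hide their order with a random permutation."""
    triple = rng.sample(range(n), 3)
    perm = [0, 1, 2]
    rng.shuffle(perm)
    sent = [triple[perm[s]] for s in range(3)]  # slot s carries triple[perm[s]]
    return triple, perm, sent

def prover_answer(proof, sent):
    """Stand-in for evaluating the PIR queries: look up each requested bit."""
    return [proof[i] for i in sent]

def verifier_unpermute(perm, answers):
    """Undo the permutation so that entry m is the answer for triple[m]."""
    restored = [None] * 3
    for s, m in enumerate(perm):
        restored[m] = answers[s]
    return restored

def run_protocol(proof, predicate, repetitions, rng):
    """Accept iff the caller-supplied predicate accepts in every repetition."""
    for _ in range(repetitions):
        triple, perm, sent = verifier_query(len(proof), rng)
        bits = verifier_unpermute(perm, prover_answer(proof, sent))
        if not predicate(triple, bits):
            return False
    return True
```

In the actual protocol all repetitions are sent in a single message and each position in `sent` is PIR-encoded under its own key; the loop here only mirrors the per-repetition check.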



Next, we describe the construction of Theorem 2, which also achieves
witness-indistinguishability. We shall use Zero-Knowledge PCP and one-round SPIR.
Again, we denote $|ZKPCP(w,x)| = N$ (recall that $N$ is polynomial in $|w|$).

This time by $j$ we denote the $O(\log^{O(1)} N)$ queries needed by ZKPCP to achieve
negligible error probability while maintaining a super-logarithmic bound on the
number of bits needed to be read by the ZKPCP verifier to break the
Zero-Knowledge property of the PCP.



V: Choose $i_1, \ldots, i_j \in [1, N]$ according to the ZKPCP Verifier; choose a
random permutation $\sigma$ over $\{i_1, \ldots, i_j\}$; let
$(i'_1, \ldots, i'_j) = \sigma(i_1, \ldots, i_j)$; compute $j$ independent SPIR
encodings $SPIR_1(i'_1), \ldots, SPIR_j(i'_j)$ (i.e. $j$ queries for retrieval
from an $N$-bit database, each having its own independent PIR encoding and
decoding keys).

V -> P: Send $SPIR_1(i'_1), \ldots, SPIR_j(i'_j)$.

P -> V: Prover computes the ZKPCP proof on $(x,w)$ and treats it as an $N$-bit
database, evaluates the $j$ received SPIR queries on it and sends the SPIR answers
back to the Verifier.

V: Decode the $j$ SPIR answers to $SPIR_1(i'_1), \ldots, SPIR_j(i'_j)$, apply
$\sigma^{-1}$ to get the answers to the $i_1, \ldots, i_j$ queries and check if the
ZKPCP verifier accepts on the answers to $i_1, \ldots, i_j$. If so accept, else reject.



5 Proof of Theorem 1

In order to prove Theorem 1 we need to prove completeness and soundness. We will
prove them in that order.

The proof of completeness follows from our construction, i.e. there exist algorithms
for C and (honest) S such that for every triple $f, x, y$ where $f$ is a polynomial
time computation, there is a certificate which is the set of correct answers for
every query set from a PCP verifier. The PIR encoding of this query set can be
efficiently computed by C. Similarly, the correct PCP answers to the query set and
their PIR encoding can also be computed efficiently by S. There are $\log^{O(1)} n$
PIR queries, each of which can be encoded in the construction of Cachin et al.
using $\log^{O(1)} n$ bits, giving a total communication complexity of
$\log^{O(1)} n$ bits both in the query and the response.
The intuition of the soundness proof is that even though a cheating server may
use a different proof string for each query response, we can construct an “emulator”
algorithm for this server that flips its coins independently of the server and the
PIR-encoded queries, selects one “proof” string according to some probability
distribution, and responds to queries honestly according to this string. Most of the
work in the proof will be in showing that the induced distribution on the query
responses from the emulator is close to that of the server, and hence the error
bounds of the PCP theorem apply with some small slack. Recall that $N$ is the length
of the PCP proof string and let $[N]$ represent $\{1, 2, \ldots, N\}$.
$\epsilon_{PCP}$ is the acceptance error of the PCP verifier using $l$ queries and
$\epsilon_{PIR}$ is the error probability of the PIR scheme used. Let
$I_1, I_2, \ldots, I_l$ be random variables representing the $l$ queries over $[N]$
asked by the verifier $V$ and $B_1, \ldots, B_l$ be the random variables representing
the decoded bit responses of the server. Let
$P^{i_1 \ldots i_l}_{b_1 \ldots b_l} = \Pr[B_1 = b_1, \ldots, B_l = b_l \mid I_1 = i_1, \ldots, I_l = i_l]$
be the probability distribution of the server's responses to queries, where the
choice is over the server's coin flips and PIR-encodings.

Given the distribution $P^{i_1 \ldots i_l}_{b_1 \ldots b_l}$, we will construct an
emulator algorithm which probabilistically (using its own coin flips and independent
of the server and queries) chooses a proof string and answers the queries according
to this string. Furthermore, the distribution induced on the emulator's responses by
this choice of string will be close to $P^{i_1 \ldots i_l}_{b_1 \ldots b_l}$. We
show that the PCP error bound applies to the emulator's responses and thence that
the error bound will also apply to the server's responses but with some additional
slack. Let the emulator have a probability distribution $Q$ on $N$-bit strings and
let $\tilde{B}_1 \ldots \tilde{B}_l$ be random variables representing the bits of the
proof string chosen according to $Q$. $Q$ induces a probability distribution on the
emulator's responses to queries. Denote this induced distribution by
$\tilde{P}^{i_1 \ldots i_l}_{b_1 \ldots b_l} = \Pr[\tilde{B}_1 = b_1, \ldots, \tilde{B}_l = b_l \mid I_1 = i_1, \ldots, I_l = i_l]$.

Define $\epsilon_S := \max_{i_1 \ldots i_l, b_1 \ldots b_l} |P^{i_1 \ldots i_l}_{b_1 \ldots b_l} - \tilde{P}^{i_1 \ldots i_l}_{b_1 \ldots b_l}|$.
First, we note that the emulator's probability of acceptance by the PCP verifier is
bounded by $\epsilon_{PCP}$.

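CLAIM 1 $\Pr[V(I_1, \ldots, I_l; \tilde{B}_1 \ldots \tilde{B}_l) = 1] \le \epsilon_{PCP}$.

Proof: $\Pr[V(I_1, \ldots, I_l; \tilde{B}_1 \ldots \tilde{B}_l) = 1] = \sum_{d \in \{0,1\}^N} \Pr[V(I_1, \ldots, I_l; d_{I_1} \ldots d_{I_l}) = 1 \mid D = d] \Pr[D = d]$,
where $d_{I_j}$ denotes the $I_j$-th bit of $d$. From the PCP Theorem, for all $d$,
$\Pr[V(I_1, \ldots, I_l; d_{I_1} \ldots d_{I_l}) = 1] \le \epsilon_{PCP}$. Applying this
to the previous equation we get
$\Pr[V(I_1, \ldots, I_l; \tilde{B}_1 \ldots \tilde{B}_l) = 1] \le \sum_{d \in \{0,1\}^N} \epsilon_{PCP} \Pr[D = d] = \epsilon_{PCP}$. □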

Next, we note that the server’s probability of acceptance by the Verifier is close to 
that of the emulator. 

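LEMMA 1 $\Pr[V(I_1, \ldots, I_l; B_1 \ldots B_l) = 1] \le \epsilon_{PCP} + 2^l \epsilon_S$.

Proof: From the PCP Theorem, $\Pr[V(I_1, \ldots, I_l; \tilde{B}_1 \ldots \tilde{B}_l) = 1] \le \epsilon_{PCP}$.
Let $i_1, \ldots, i_l \in [N]$ be query instances and $b_1 \ldots b_l$ be bit strings.
Using conditional probabilities we can express the LHS as a sum over all query
instances and response bit strings. Using
$|P^{i_1 \ldots i_l}_{b_1 \ldots b_l} - \tilde{P}^{i_1 \ldots i_l}_{b_1 \ldots b_l}| \le \epsilon_S$
and collapsing the sums we get
$\Pr[V(I_1, \ldots, I_l; B_1 \ldots B_l) = 1] \le \Pr[V(I_1, \ldots, I_l; \tilde{B}_1 \ldots \tilde{B}_l) = 1] + \epsilon_S \sum_{i_1 \ldots i_l} \sum_{b_1 \ldots b_l} \Pr[I_1 = i_1, \ldots, I_l = i_l] \le \epsilon_{PCP} + 2^l \epsilon_S$. □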
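The two bounds above can be checked numerically on a toy instance. The Python sketch below is an illustration under stated assumptions, not part of the paper's proof: the "verifier" is a hypothetical one that picks one of six XOR constraints on a 5-bit string uniformly at random, the set of constraints is chosen to be unsatisfiable as a whole, and the cheating server is modelled as a mixture of an honest emulator and a strategy that answers each tuple separately. The names `acceptance`, `induced` and `mix` are invented for this example.

```python
from fractions import Fraction
from itertools import product

N = 5
# Hypothetical toy verifier: choose one (triple, target) uniformly and accept
# iff the XOR of the three answered bits equals the target.
CONSTRAINTS = [((0, 1, 2), 0), ((0, 1, 3), 0), ((2, 3, 4), 1),
               ((0, 1, 4), 0), ((0, 2, 4), 1), ((1, 2, 4), 1)]
ANSWERS = list(product((0, 1), repeat=3))

def acceptance(response):
    """Acceptance probability, given response[triple][bits] = Pr[bits | triple]."""
    total = Fraction(0)
    for triple, target in CONSTRAINTS:
        total += sum(p for bits, p in response[triple].items()
                     if (bits[0] ^ bits[1] ^ bits[2]) == target)
    return total / len(CONSTRAINTS)

def induced(q):
    """Responses of an emulator that draws one string from q and answers honestly."""
    response = {}
    for triple, _ in CONSTRAINTS:
        dist = {bits: Fraction(0) for bits in ANSWERS}
        for string, prob in q.items():
            dist[tuple(string[i] for i in triple)] += prob
        response[triple] = dist
    return response

def mix(honest, delta):
    """Server that follows `honest` with probability 1 - delta, else cheats per tuple."""
    response = {}
    for triple, target in CONSTRAINTS:
        cheat = (0, 0, target)  # always passes the XOR check for this tuple
        response[triple] = {bits: (1 - delta) * p + (delta if bits == cheat else 0)
                            for bits, p in honest[triple].items()}
    return response

# Soundness error of the toy verifier: best acceptance over all fixed strings.
eps_pcp = max(acceptance(induced({s: Fraction(1)}))
              for s in product((0, 1), repeat=N))
q = {(1, 0, 1, 1, 1): Fraction(1, 2), (0, 0, 0, 0, 0): Fraction(1, 2)}
emulator = induced(q)
server = mix(emulator, Fraction(1, 10))
eps_s = max(abs(server[t][b] - emulator[t][b])
            for t, _ in CONSTRAINTS for b in ANSWERS)
```

With these definitions, `acceptance(emulator) <= eps_pcp` is the statement of Claim 1 and `acceptance(server) <= eps_pcp + 2**3 * eps_s` is the statement of Lemma 1 for three queries. On an instance this small the second bound is very loose; the sketch is only meant to show which quantities enter it.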

Next, we show the existence of an emulator whose response distribution is close to
the server's. Note that a mere proof of existence suffices.

LEMMA 2 There exists an “emulator” algorithm such that es < ep/fl • 

For clarity of exposition, we write the proof using only 3-tuple queries and provide the
calculation for $l > 3$ queries at the end. First, we make a simple claim which follows
from the fact that the client chooses a random permutation of the 3-query, hence the
server strategy has to be oblivious of the order of elements within the 3-tuple. Thus,
w.l.o.g. we only need to consider the distributions $P^{ijk}_{b_1 b_2 b_3}$ where $1 \le i < j < k \le N$.
CLAIM 2 For all permutations $\sigma$ on $\{1, 2, 3\}$, all 3-tuples $i_1, i_2, i_3$, and all decodings
$b_1, b_2, b_3$,

$P^{i_1 i_2 i_3}_{b_1 b_2 b_3} = P^{i_{\sigma(1)} i_{\sigma(2)} i_{\sigma(3)}}_{b_{\sigma(1)} b_{\sigma(2)} b_{\sigma(3)}}$.



We want an emulator whose probability distribution $Q$ over $N$-bit strings is
consistent with the distribution of the server's responses. Formally, if
$\tilde{P}^{i_1 i_2 i_3}_{b_1 b_2 b_3}$ is the probability distribution induced by $Q$
on indices $i_1, i_2, i_3$, then we want
$\forall i_1, i_2, i_3, b_1, b_2, b_3: \tilde{P}^{i_1 i_2 i_3}_{b_1 b_2 b_3} = P^{i_1 i_2 i_3}_{b_1 b_2 b_3}$
to be true. What we will actually show is that there is an emulator for which this
is true with small error, i.e.
$\max_{i_1 i_2 i_3, b_1 b_2 b_3} |\tilde{P}^{i_1 i_2 i_3}_{b_1 b_2 b_3} - P^{i_1 i_2 i_3}_{b_1 b_2 b_3}|$
is small. To show the existence of such an emulator, we write out the equations
that the above equality implies on the actual proof strings that $Q$ is over.

Construct matrix $A$ with $8\binom{N}{3}$ rows and $2^N$ columns as follows: the
rows of $A$ are labeled by the different query-response tuples
$r^{i_1 i_2 i_3}_{b_1 b_2 b_3}$ and the columns are labeled by the $2^N$ possible
$N$-bit strings. Let $B$ be the vector of the $8\binom{N}{3}$ values
$P^{i_1 i_2 i_3}_{b_1 b_2 b_3}$ from the server's strategy. Let $x$ be the
probability vector with $2^N$ entries where $x_i$ is the probability that the
$i$-th $N$-bit string is chosen as the database. To prove Lemma 2 it is enough to
show that the system $Ax = B$ can be solved for the probability vector $x$.
Any such solution can be used for the distribution $Q$.

For clarity, we label the $8\binom{N}{3}$ rows of matrix $A$ in lexicographic order
over the tuples and in increasing order of weight over bit strings
($r^{ijk}_{000}, r^{ijk}_{001}, r^{ijk}_{010}, r^{ijk}_{100}, r^{ijk}_{011}, \ldots$)
and the $2^N$ columns of matrix $A$ by subsets of $[N]$ enumerated in lexicographic
order. That is, let
$\emptyset, \{1\}, \{2\}, \ldots, \{N\}, \{1,2\}, \{1,3\}, \ldots, \{N-1,N\}, \ldots, \{1, \ldots, N\}$
represent the labels of the columns of $A$ in order. Now we can describe how the
elements of $A$ can be filled: let $I = r^{ijk}_{b_1 b_2 b_3}$ be the label of the
row in question and let $J \subseteq \{1, \ldots, N\}$ be the label of the column,
with $D_J$ the string it represents. Then set $A[I, J] = 1$ if $D_J$ is consistent
with $I$, i.e. if the $J$-th string yields the responses $b_1, b_2, b_3$ at positions
$i, j, k$, and $A[I, J] = 0$ otherwise. Clearly, if there is a solution $x$ which is a
probability distribution on $N$-bit strings, then the emulator using this
distribution can generate the same distribution as the server on query-response
tuples.
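For a small $N$ the matrix $A$ can be written out explicitly. The following Python sketch builds it under one stated assumption about the labelling: a column labelled by the subset $J$ is taken to be the string that has bit 0 exactly at the positions in $J$ (this reading makes the sum of the rows for responses 000 and 001 equal to the indicator of "the column label contains $i$ and $j$"). The function names are hypothetical, and exact rational arithmetic is used so that $Ax$ can be compared without rounding.

```python
from fractions import Fraction
from itertools import combinations, product

def column_labels(n):
    """Subsets of {1..n}, ordered by size and then lexicographically."""
    return [frozenset(s) for size in range(n + 1)
            for s in combinations(range(1, n + 1), size)]

def row_labels(n):
    """(triple, bits) pairs: triples in lexicographic order, bits by increasing weight."""
    by_weight = sorted(product((0, 1), repeat=3), key=lambda b: (sum(b), b))
    return [(t, b) for t in combinations(range(1, n + 1), 3) for b in by_weight]

def entry(row, col):
    """1 iff the string labelled `col` gives responses `bits` at the queried positions.
    Assumed convention: position p holds bit 0 exactly when p is in `col`."""
    triple, bits = row
    return int(all((b == 0) == (p in col) for p, b in zip(triple, bits)))

def build_matrix(n):
    rows, cols = row_labels(n), column_labels(n)
    return rows, cols, [[entry(r, c) for c in cols] for r in rows]

def apply(matrix, x):
    """Matrix-vector product A x for a probability vector x over the columns."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]
```

For $N = 4$ this gives a 32 by 16 matrix; applying it to the uniform distribution over the 16 strings yields 1/8 in every coordinate, which is the response distribution of an emulator that answers from a uniformly random string.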

In order to prove that a solution exists, we now define a series of row
transformations on $A$ in lexicographic order. For the first row $r^{ijk}_{000}$
there is no change. For the next row, $r^{ijk}_{001}$, we add the previous row to
this one. That is, the second row with label $r^{ijk}_{001}$ is now
$r^{ijk}_{000} + r^{ijk}_{001}$. Define $R^{ij}_{00}$ as the vector which is defined
as: $R^{ij}_{00}[L] = 1$ if the $L$-th column label contains $i$ and $j$. Note that
$r^{ijk}_{000}$ and $r^{ijk}_{001}$ cannot both be 1 in any index. Hence, the row
with label $r^{ijk}_{001}$ now holds the row $R^{ij}_{00}$. Next, for the row with
label $r^{ijk}_{010}$, we add the row labeled $r^{ijk}_{000}$ to get the row
$R^{ik}_{00}$, and similarly for the row labeled $r^{ijk}_{100}$. Finally, for the
row labeled $r^{ijk}_{011}$ we add the rows labeled $r^{ijk}_{000}$, $r^{ijk}_{001}$
and $r^{ijk}_{010}$. Defining $R^{i}_{0}$ as the vector with 1 in all the positions
where the $i$-th bit is 1, we note that the row labeled $r^{ijk}_{011}$ now holds the
row $R^{i}_{0}$. Analogously, we transform the row labeled $r^{ijk}_{101}$ to
$R^{j}_{0}$ and the row labeled $r^{ijk}_{110}$ to $R^{k}_{0}$. Finally, the row
labeled $r^{ijk}_{111}$ can be transformed to the all 1's vector by adding to it all
the previous rows. Next in the lexicographic order are the rows labeled
$r^{ijk'}_{000}$ through $r^{ijk'}_{111}$. We follow the same procedure outlined
above for these rows. Since every one of the rows in the matrix has a label of one
of these forms, we can carry out this procedure for all the rows of $A$ to give us
a new matrix $A'$.

In order that the solutions to the system $Ax = B$ remain unchanged when we
transform $A$ to $A'$, we have to perform the same set of row operations on $B$.
To start with, consider $P^{ijk}_{001}$. To be consistent with $A$, we add
$P^{ijk}_{000}$ to give $P^{ijk|ij}_{00}$. Similarly, we add $P^{ijk}_{000}$ to
$P^{ijk}_{010}$. Let this sum be called $P^{ijk|ik}_{00}$, and analogously for
$P^{ijk|jk}_{00}$. Next, for $P^{ijk}_{011}$, we add $P^{ijk}_{000}$,
$P^{ijk}_{001}$ and $P^{ijk}_{010}$ to get $P^{ijk|i}_{0}$. Similarly, we get
$P^{ijk|j}_{0}$ and $P^{ijk|k}_{0}$. Finally, replace $P^{ijk}_{111}$ by the sum of
all the eight quantities, i.e. $\sum_{b_1 b_2 b_3} P^{ijk}_{b_1 b_2 b_3}$. Follow
this procedure for all the values of $B$ to give a new vector $B'$.
CLAIM 3 The systems $Ax = B$ and $A'x = B'$ have exactly the same set of solutions.



Proof: We can write the row transformations that we have performed as a matrix T 
that left multiplies both sides. Since we have performed the same transformations on 
B as on A, we get TAx = TB. Trivially, T is an invertible matrix. □ 

Now, we proceed to show that the system $A'x = B'$ is solvable, i.e. there is at least
one solution which is a probability. As is obvious from the construction of $A'$, all rows
are duplicated many times. Hence $A'$ is not full rank and we have to show that $B'$ is
in the column space of $A'$. We do this by collecting all the unique rows and reordering
the rows of matrix $A'$ so that it becomes upper triangular. Since we want $A'$ to be
upper triangular, we must dovetail the row order to the column order that we have
already established. Hence, we list the rows in the following order now: the first row
will be the all ones vector. Next, we list in order $R^{i}_{0}$, $i = 1, \ldots, N$
(only one copy each). Consider the elements of the row $R^{1}_{0}$. Recall that the columns
of $A'$ (as were the columns of $A$) are labeled as $\emptyset, \{1\}, \{2\}, \ldots, \{N\}, \{1,2\}, \ldots$. Hence,
$R^{1}_{0}$ is of the form $0, 1, \ldots$ whereas $R^{2}_{0}$ has the form $0, 0, 1, \ldots$ and so on. Next, we list
in order one copy each of $R^{ij}_{00}$. Similarly, $R^{12}_{00}$ will have $N + 1$ leading zeros and then
a 1 in the column labeled $\{1,2\}$. One can similarly write the elements of the rows of
the form $R^{ijk}_{000}$. We have accounted for $t = 1 + N + \binom{N}{2} + \binom{N}{3}$ rows, and listing
the rows in this order gives an upper triangular part of $A'$. To preserve equivalence,
we have to reorder the elements of B' in the same way. The first element of B' is 
1. The second element in B' (corresponding to row is Similarly, the next 

N —1 elements are through i.iv|jv next set of elements are Pqq^'^^ 

through 

The remaining $8\binom{N}{3} - (1 + N + \binom{N}{2} + \binom{N}{3})$ rows are all duplicates of the rows in
the upper triangle since our transformation from $A$ to $A'$ transformed all the rows. For
example, the row $R^{ij}_{00}$ will be repeated $N - 2$ times, once for every $k \in [N]$, $k \ne i \ne j$.
Similarly, $R^{i}_{0}$ will be repeated once for every pair $j, k \in [N]$, $i \ne j \ne k$. This gives us
an easy construction of the left null space $L_{A'}$ of $A'$. For every $i$-th row not in the
upper triangle, we know that it is a duplicate of some $j$-th row in the upper triangle.
We can construct a vector in $L_{A'}$ which has a 1 in the $i$-th position and $-1$ in the $j$-th
position. This construction describes all the vectors in $L_{A'}$. Now, it is sufficient that $B'$
be orthogonal to all the vectors in $L_{A'}$, i.e. $\forall y \in L_{A'}: \langle y, B' \rangle = 0$. By our construction,
every such $y$ has a $-1$ in one of the first $t$ elements and a $+1$ in one of the remaining
positions. This gives us the following equality constraints between elements of $B'$. The
duplicates of the rows in $A'$ of the form $R^{ij}_{00}$ give constraints of the form:

$\forall i, j, k, k' \in [N]$, $i \ne j \ne k \ne k'$: $P^{ijk}_{000} + P^{ijk}_{001} = P^{ijk'}_{000} + P^{ijk'}_{001}$.

All the duplicate rows in $A'$ of the form $R^{i}_{0}$ give constraints of the form
$\forall i, j, j', k, k' \in [N]$, $i \ne j \ne j' \ne k \ne k'$:

$P^{ijk}_{000} + P^{ijk}_{001} + P^{ijk}_{010} + P^{ijk}_{011} = P^{ij'k'}_{000} + P^{ij'k'}_{001} + P^{ij'k'}_{010} + P^{ij'k'}_{011}$.

Hence, if these constraints are true on B' then B' is in the column space of A' . 

Define projections of the probability distributions $P^{ijk}_{b_1 b_2 b_3}$ as follows. Let
$P^{ijk|ij}_{b_1 b_2} = P^{ijk}_{b_1 b_2 0} + P^{ijk}_{b_1 b_2 1}$. Similarly, define
projections of the form $P^{ijk|i}_{b_1}$. The following claim follows from the
assumption of zero error probability for PIR. It says that the server's
response on a particular query index cannot depend on the tuple.



CLAIM 4 Let $t$ be any single query or two query indices and let $s$ and $s'$ be any
three-query tuples containing $t$. Then, if $\epsilon_{PIR} = 0$, $P^{s|t} = P^{s'|t}$.
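The projection condition of the claim is easy to test on explicit response tables. The Python sketch below uses hypothetical helper names (`triple_response`, `project`) and 0-based positions; it shows that a server answering from a single distribution over strings has projections that do not depend on the third query, whereas a strategy that answers differently depending on the whole tuple violates the condition and therefore cannot be explained by any distribution over strings.

```python
from fractions import Fraction

def triple_response(q, triple):
    """Response distribution on `triple` for a server that samples a string from q."""
    dist = {}
    for string, prob in q.items():
        key = tuple(string[p] for p in triple)
        dist[key] = dist.get(key, Fraction(0)) + prob
    return dist

def project(dist, keep):
    """Marginalise a 3-bit response distribution onto the answer slots in `keep`."""
    out = {}
    for bits, prob in dist.items():
        key = tuple(bits[s] for s in keep)
        out[key] = out.get(key, Fraction(0)) + prob
    return out

# A string-based (emulator-like) server over 4-bit strings.
q = {(0, 0, 1, 1): Fraction(1, 2),
     (1, 0, 1, 0): Fraction(1, 4),
     (1, 1, 0, 0): Fraction(1, 4)}
on_012 = project(triple_response(q, (0, 1, 2)), (0, 1))
on_013 = project(triple_response(q, (0, 1, 3)), (0, 1))

# A tuple-dependent strategy: all zeros on one triple, all ones on another.
cheat_012 = {(0, 0, 0): Fraction(1)}
cheat_013 = {(1, 1, 1): Fraction(1)}
```

Here `on_012` and `on_013` coincide, while `project(cheat_012, (0, 1))` and `project(cheat_013, (0, 1))` differ; with an ideal PIR the server cannot tell which third index accompanies positions 0 and 1, which is what rules out the second kind of strategy.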
By the claim above, projection probabilities are well defined. Let
$P^{i_1 i_2} = P^{s|i_1, i_2}$ for any $s$ which contains $i_1$ and $i_2$. Likewise,
$P^{i_1}$ is defined analogously.

LEMMA 3 The system $A'x = B'$ yields a valid set $x$ of probabilities on $N$-bit strings
when $\epsilon_{PIR} = 0$.

Proof: We have shown above that the system A'x = B' has a solution when B' satisfies 
the projection constraints above. Indeed, that these constraints are satisfied follows 
immediately from the assumption of zero-error PIR. However, we still have to ensure 
that the solutions we have are probabilities. The first row ensures that all the elements 
of the solution vector sum up to 1. However, it is not obvious that the elements are 
non-negative. To see this, we break the solution to the upper triangular system into 
blocks. An {i,j) block is a subset of rows consisting of the row Pqq and all rows of 
the form Note that Pqoo is already non- negative since it is a given probability. 

The constraints on imply that Pq^ = ^ooo + ^ooi all k ^ i ^ j. Furthermore, 
the rows Pq^q has no common 1 positions with any row except rows of the form Pqq^. 
From this we can infer that a non-negative solution exists in this block. We can follow 
the same argument for all such (i,j) blocks since they are independent. Finally, by an 
analogous argument, it can be shown that the $i$ blocks can be solved with non-negative
solutions as well. □ 

It now remains to consider the case of $\epsilon_{PIR} > 0$. First, we note that
permutations of a query do not give any advantage to the adversary since we chose a
random permutation of any query. However, the projection constraints may not hold
exactly, and hence existence of a solution to $A'x = B'$ is not a given. A PIR scheme
with $\epsilon_{PIR} > 0$ implies that the projection constraints are satisfied with
error bounded by $\epsilon_{PIR}$. That is, for example,
$\forall i \ne j \ne k \ne k' \in [N]$:
$|P^{ijk}_{000} + P^{ijk}_{001} - (P^{ijk'}_{000} + P^{ijk'}_{001})| \le \epsilon_{PIR}$.
The following lemma shows that there is a solution $\tilde{P}$ that satisfies all the
projection constraints and is close to the given probabilities $P$.



LEMMA 4 There exists a probability vector P^^ " 5 “ such that the system Ax = P is 
solvable and maxi 



jijk 









Proof: To create the vector P of probabilities we first compute Pgg := X/ 

where := Similarly, Pi := with 






pijk\i analogously. We now show how to compute P^.^f. 



kjii^j - 

from these values. 



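There are 8 values to be calculated, $\tilde{P}^{ijk}_{000}$ through
$\tilde{P}^{ijk}_{111}$. Start with $\tilde{P}^{ijk}_{000} = P^{ijk}_{000}$. Given
the values $\tilde{P}^{ij}_{00}$ and $\tilde{P}^{i}_{0}$ we can compute all the
$\tilde{P}^{ijk}_{b_1 b_2 b_3}$ such that they satisfy the projection constraints
exactly as follows. $\tilde{P}^{ijk}_{001} := \tilde{P}^{ij}_{00} - \tilde{P}^{ijk}_{000}$.
Similarly, we can compute
$\tilde{P}^{ijk}_{010} := \tilde{P}^{ik}_{00} - \tilde{P}^{ijk}_{000}$ and
$\tilde{P}^{ijk}_{100} := \tilde{P}^{jk}_{00} - \tilde{P}^{ijk}_{000}$,
$\tilde{P}^{ijk}_{011} := \tilde{P}^{i}_{0} - (\tilde{P}^{ijk}_{000} + \tilde{P}^{ijk}_{001} + \tilde{P}^{ijk}_{010})$,
and so on. Finally, we have to show that the probabilities $\tilde{P}$ are within
some $\epsilon$ of $P$. We can compute the distance by tracing the computation path
of these probabilities above. The error bound depends on the number of queries $l$
asked and so we state this last bound in terms of $l$.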



CLAIM 5 For any client making I queries, there exists a probability vector P satisfying 
the projection constraints such that for all and bi...bi, \Pl^"'l‘ ~ Pp < 

This claim completes the proof of LemmaEl We omit the simple proof for lack of space. 
Proof of Theorem 0 Given a cheating server whose acceptance probability for a lan- 
guage L £ NP is > epcp, using Lemmas [DandEl there is an emulator whose acceptance 
probability is > tpcp + tpiR ■ epcp is already negligible by assumption. The 

second term can be made negligible by increasing the security parameter in the PIR 
scheme by a poly-logarithmic factor. □ 
6 Extensions and Further Results

We briefly sketch the proof of Theorem 2. It is straightforward to see that Theorem 2
carries all the properties of Theorem 1 using essentially the same proof (i.e. showing
that one can construct a probability distribution on databases which will pass the PCP
theorem) but converting the PCP proof into a ZKPCP proof and using SPIR instead of
PIR. To prove witness-indistinguishability, we now assume that the Prover is honest,
and that there is a poly-bounded Verifier who wants to distinguish witness $w_1$ from
witness $w_2$. SPIR guarantees that the Verifier will not be able to read more than the
$\log^{O(1)} N$ bits allowed by our protocol, and by the properties of ZKPCP the bits
retrieved are computationally indistinguishable for $w_1$ and $w_2$. Hence, if the
Verifier can distinguish, this violates either SPIR security or the zero-knowledge
property of ZKPCP [17]. This sketch will be elaborated in the full version of the paper.

The proof of Theorem [?] also has implications for one-round multi-prover proof systems
in the information-theoretic setting. Notice that we show in Theorem [?] how to
combine PIR protocols with PCP in one round. In the multi-prover setting,
information-theoretic implementations of PIR are known [?] with small communication
complexity. Our theorem provides an alternative way to reduce the communication
complexity in the multi-prover setting: all provers can write down a PCP proof on their
work tapes, and then the communication complexity is simply the cost of retrieving a
poly-logarithmic number of bits using information-theoretically secure PIR protocols.
So far, the best bound known for retrieving a single bit using information-theoretic
PIR with only two provers is $O(N^{1/3})$ [?]. Thus, our approach gives an inferior result
to [?]. However, our technique works with any multi-database PIR protocol, and thus an
improvement in information-theoretic PIR protocols would improve the communication
complexity in our approach. We remark as an aside that our results also hold in the
setting of universal service provider PIR [?].

Note that our “emulation” technique is not PCP-specific. More specifically, any 
adversary who does not answer PIR queries honestly can be emulated (with small 
error) by an algorithm that flips coins independently of the PIR encodings to choose a 
database which it then uses to answer the queries honestly. 

We also wish to point out that the same Verifier’s message can be used for multiple 
proofs to multiple provers, and they all can be checked for correctness (with the same 
error probability) without any additional messages from the verifier. As long as the 
Verifier does not reveal any additional information about his private key or whether 
he accepted or rejected each individual input, the error probability holds. 

Abstract. A new unfolding approach to LTL model checking is presented, in which
the model checking problem can be solved by direct inspection of a certain finite
prefix. The techniques presented so far required running an elaborate algorithm
on the prefix.



1 Introduction 

Unfoldings are a partial order technique for the verification of concurrent and
distributed systems, initially introduced by McMillan [?]. They can be understood
as the extension to communicating automata of the well-known unfolding of a
finite automaton into a (possibly infinite) tree. The unfolding technique can be
applied to systems modelled by Petri nets, communicating automata, or process
algebras [?]. It has been used to verify properties of circuits, telecommunication
systems, distributed algorithms, and manufacturing systems [?].

Unfoldings have proved to be very suitable for deadlock detection and invariant
checking [?]. For these problems, one first constructs a so-called complete
prefix [?], a finite initial part of the unfolding containing all the reachable states.
This prefix is at most as large as the state space, and usually much smaller (often
exponentially smaller). Once the prefix has been constructed, the deadlock
detection problem can be easily reduced to a graph problem [?], an integer linear
programming problem [?], or to a logic programming problem [?].

In [?] and [?], unfolding-based model checking algorithms have been
proposed for a simple branching-time logic and for LTL, respectively. Although
the algorithms have been applied with success to a variety of examples, they are
not completely satisfactory: after constructing the complete prefix, the model
checking problem cannot yet be reduced to a simple problem like, say, finding
cycles in a graph. In the case of LTL the intuitive reason is that the infinite
sequences of the system are “hidden” in the finite prefix in a complicated way.
In order to make them “visible”, a certain graph has to be constructed.
Unfortunately, the graph can be exponentially larger than the complete prefix itself.
Niebert has observed [?] that this exponential blow-up already appears in a
system of $n$ independent processes, each of them consisting of an endless loop with
one single action as body. The complete prefix has size $O(n)$, which in principle
should lead to large savings in time and space with respect to an interleaving
approach, but the graph is of size $O(2^n)$, i.e. as large as the state space itself.

* Work partially supported by the Teilprojekt A3 SAM of the Sonderforschungsbereich
342 “Werkzeuge und Methoden für die Nutzung paralleler Rechnerarchitekturen”,
the Academy of Finland (Project 47754), and the Nokia Foundation.

In this paper we present a different unfolding technique which overcomes this
problem. Instead of unrolling the system until a complete prefix has been
generated, we “keep on unrolling” for a while, and stop when certain conditions are
met. There are two advantages: (i) the model checking problem can be solved by
a direct inspection of the prefix, and so we avoid the construction of the possibly
exponential graph; and (ii) the algorithm for the construction of the new prefix
is similar to the old algorithm for the complete prefix; only the definition of a
cut-off event needs to be changed. The only disadvantage is the larger size of the
new prefix. Fortunately, we are able to provide a bound: the prefix of a system
with $K$ reachable states contains at most $O(K^2)$ events, assuming that the system
is presented as a 1-safe Petri net or as a product of automata.¹ Notice that
this is an upper bound: the new prefix is usually much smaller than the state
space, and in particular for Niebert’s example it grows linearly in $n$.

The paper is structured as follows (for detailed definitions and proofs see
the full version [?]). Section 2 presents the automata theoretic approach to LTL
model checking. In Sect. 3 the unfolding method is introduced. Sections 4 and
5 contain the tableau systems for the two subproblems. In Sect. 6 we show how
LTL model checking can be solved with the presented tableau systems. In Sect. 7
we conclude and discuss topics for further research.

2 Automata Theoretic Approach to Model Checking LTL 



Petri nets. We assume that the reader is familiar with basic notions, such as
net, preset, postset, marking, firing, firing sequence, and reachability graph. We
consider labelled nets, in which places and transitions carry labels taken from a
finite alphabet $\mathcal{L}$, and labelled net systems. We denote a labelled net system by
$\Sigma = (P, T, F, l, M_0)$, where $P$ and $T$ are the sets of places and transitions, $F$ is
the flow function $F\colon (P \times T) \cup (T \times P) \to \{0, 1\}$, $l\colon P \cup T \to \mathcal{L}$ is the
labelling function, and $M_0$ is the initial marking.

We present how to modify the automata theoretic approach to model checking
LTL [15] to best suit the net unfolding approach. For technical convenience
we use an action-based temporal logic instead of a state-based one, namely the
linear temporal logic tLTL' of Kaivola, which is immune to the stuttering of
invisible actions [?]. With small modifications the approach can also handle
state-based stuttering-invariant logics such as LTL-X. Given a finite set $A$ of actions,
and a set $V \subseteq A$ of visible actions, the abstract syntax of tLTL' is given by:

$\varphi ::= \top \mid \neg\varphi \mid \varphi_1 \vee \varphi_2 \mid \varphi_1 \,\mathcal{U}\, \varphi_2 \mid \varphi_1 \,\mathcal{U}_a\, \varphi_2$, where $a \in V$

Formulas are interpreted over sequences of $A^\omega$. The semantics of $\varphi_1 \,\mathcal{U}\, \varphi_2$ is
as expected. Loosely speaking, a sequence $w$ satisfies $\varphi_1 \,\mathcal{U}_a\, \varphi_2$ if $\varphi_1$ holds until
the first $a$ in $w$, and then $\varphi_2$ holds.²

¹ More precisely, the number of non-cut-off events is at most $O(K^2)$.

Given a net system $\Sigma = (P, T, F, l, M_0)$, where the transitions of $T$ are
labelled with actions from the set $A$, and a formula $\varphi$ of tLTL', the model checking
problem consists of deciding if all the infinite firing sequences of $\Sigma$ satisfy $\varphi$.

The automata theoretic approach attacks this problem as follows. First, a
procedure similar to that of [?] converts the negation of $\varphi$ into a Büchi automaton
$A_{\neg\varphi}$ over the alphabet $\Gamma = V \cup \{\tau\}$, where $\tau \notin A$ is a new label used to
represent all the invisible actions. Then, this automaton is synchronized with $\Sigma$ on
visible actions (see [?] for details). The synchronization can be represented by a new
labelled net system $\Sigma_{\neg\varphi}$ containing a transition $(u, t)$ for every
$u = q \xrightarrow{a} q'$ in $A_{\neg\varphi}$ and for every $t \in T$, such that $l(t) = a$ and $a \in V$,
plus other transitions for the invisible transitions of $\Sigma$. We say that $(u, t)$ is an
infinite-trace monitor if $q'$ is a final state of $A_{\neg\varphi}$, and a livelock monitor if the
automaton $A_{\neg\varphi}$ accepts an infinite sequence of invisible transitions (a livelock)
with $q'$ as initial state. The sets of infinite-trace and livelock monitors are denoted
by $I$ and $L$, respectively. An illegal $\omega$-trace of $\Sigma_{\neg\varphi}$ is an infinite firing
sequence $M_0 \xrightarrow{t_1 t_2 \ldots}$ such that $t_i \in I$ for infinitely many indices $i$. An
illegal livelock of $\Sigma_{\neg\varphi}$ is an infinite firing sequence
$M_0 \xrightarrow{t_1 t_2 \ldots t_i} M \xrightarrow{t_{i+1} t_{i+2} \ldots}$ such that $t_i \in L$, and
$t_{i+k} \in (T \setminus V)$ for all $k \geq 1$. We have the following result:

Theorem 1. Let $\Sigma$ be a labelled net system, and $\varphi$ a tLTL'-formula. $\Sigma \models \varphi$ if
and only if $\Sigma_{\neg\varphi}$ has no illegal $\omega$-traces and no illegal livelocks.

The intuition behind this theorem is as follows. Assume that $\Sigma$ can execute
an infinite firing sequence corresponding to a word $w \in (V \cup \{\tau\})^\omega$ violating $\varphi$
(where ‘corresponding’ means that the firing sequence executes the same visible
actions in the same order, and an invisible action for each $\tau$). If $w$ contains
infinitely many occurrences of visible actions, then $\Sigma_{\neg\varphi}$ contains an illegal
$\omega$-trace; if not, it contains an illegal livelock.
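
For intuition, the illegal ω-trace condition can be checked on a small net by brute-force exploration of the reachability graph, which is the interleaving approach the unfolding method is meant to avoid. The following is only a baseline sketch under stated assumptions: the net, the set `i_set`, and all function names are hypothetical, markings are sets of places (1-safe), and an illegal ω-trace is detected as a reachable cycle that fires a transition of I.

```python
from collections import deque

# Hypothetical 1-safe net (not the net of Fig. 1): transition -> (preset, postset).
NET = {
    "a": ({"p1"}, {"p2"}),
    "b": ({"p2"}, {"p1"}),
    "c": ({"p1"}, {"p3"}),
    "d": ({"p3"}, {"p3"}),
}
M0 = frozenset({"p1"})


def explore(net, start):
    """Breadth-first construction of the reachability graph from `start`."""
    seen, edges, queue = {start}, [], deque([start])
    while queue:
        m = queue.popleft()
        for t, (pre, post) in net.items():
            if pre <= m:  # t is enabled at marking m
                m2 = frozenset((m - pre) | post)
                edges.append((m, t, m2))
                if m2 not in seen:
                    seen.add(m2)
                    queue.append(m2)
    return seen, edges


def has_illegal_omega_trace(net, m0, i_set):
    """True iff some reachable cycle fires a transition of `i_set`.

    In a finite reachability graph this is equivalent to the existence of an
    infinite firing sequence with infinitely many I-transitions.
    """
    _, edges = explore(net, m0)
    for m, t, m2 in edges:
        # The edge m -t-> m2 lies on a cycle iff m is reachable again from m2.
        if t in i_set and m in explore(net, m2)[0]:
            return True
    return False
```

With this hypothetical net, `has_illegal_omega_trace(NET, M0, {"a"})` holds because `a` lies on the cycle between the markings {p1} and {p2}, whereas choosing `{"c"}` as the set I gives no illegal ω-trace, since `c` can fire only once.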

In the next sections we provide unfolding-based solutions to the problems
of detecting illegal $\omega$-traces and illegal livelocks. We solve the problems in an
abstract setting. We fix a net system $\Sigma = (P, T, F, M_0)$, where $T$ is divided into
two sets $V$ and $T \setminus V$ of visible and invisible transitions, respectively. Moreover, $T$
contains two special subsets $L$ and $I$. We assume that no reachable marking of $\Sigma$
concurrently enables a transition of $V$ and a transition of $L$. We further assume
that $M_0$ does not put more than one token on any place. In particular, when
applying the results to the model checking problem for tLTL' and Petri nets, the
system $\Sigma$ is the synchronization $\Sigma_{\neg\varphi}$ of a Petri net and a Büchi automaton, and
it satisfies these conditions. We use as running example the net system of Fig. 1.
We have $V = \{t_6\}$, $I = \{t_1\}$, and $L = \{t_2\}$. The system has illegal $\omega$-traces (for
instance, $(t_1 t_3 t_4 t_6 t_7)^\omega$), but no illegal livelocks.

² Kaivola’s semantics is interpreted over $A^* \cup A^\omega$, which is a small technical difference.




J. Esparza and K. Heljanko 



Fig. 1. A net system



3 Basic Definitions on Unfoldings 

In this section we briefly introduce the definitions we need to describe the
unfolding approach to our two problems. More details can be found in [?].

Occurrence nets. Given two nodes $x$ and $y$ of a net, we say that $x$ is causally
related to $y$, denoted by $x < y$, if there is a path of arrows from $x$ to $y$. We say
that $x$ and $y$ are in conflict, denoted by $x \# y$, if there is a place $z$, different from
$x$ and $y$, from which one can reach $x$ and $y$, exiting $z$ by different arrows. Finally,
we say that $x$ and $y$ are concurrent, denoted by $x$ co $y$, if neither $x < y$ nor $y < x$
nor $x \# y$ hold. A co-set is a set of nodes $X$ such that $x$ co $y$ for every $x, y \in X$.
Occurrence nets are those satisfying the following three properties: the net, seen
as a graph, has no cycles; every place has at most one input transition; and no
node is in self-conflict, i.e., $x \# x$ holds for no $x$. A place of an occurrence net is
minimal if it has no input transitions. The net of Fig. 2 is an infinite occurrence
net with minimal places $a$, $b$. The default initial marking of an occurrence net
puts one token on each minimal place and none in the rest.

Branching processes. We associate to $\Sigma$ a set of labelled occurrence nets, called
the branching processes of $\Sigma$. To avoid confusion, we call the places and
transitions of branching processes conditions and events, respectively. The conditions
and events of branching processes are labelled with places and transitions of $\Sigma$,
respectively. The conditions and events of the branching processes are subsets
of two sets $\mathcal{B}$ and $\mathcal{E}$, inductively defined as the smallest sets satisfying:

— $\bot \in \mathcal{E}$, where $\bot$ is a special symbol;

— if $e \in \mathcal{E}$, then $(p, e) \in \mathcal{B}$ for every $p \in P$;

— if $\emptyset \subset X \subseteq \mathcal{B}$, then $(t, X) \in \mathcal{E}$ for every $t \in T$.

In our definitions we make consistent use of these names: The label of a
condition $(p, e)$ is $p$, and its unique input event is $e$. Conditions $(p, \bot)$ have no
input event, i.e., the special symbol $\bot$ is used for the minimal places of the
occurrence net. Similarly, the label of an event $(t, X)$ is $t$, and its set of input
conditions is $X$. The advantage of this scheme is that a branching process is
completely determined by its sets of conditions and events. We make use of this
and represent a branching process as a pair $(B, E)$.

Definition 1. The set of finite branching processes of a net system $\Sigma$ with the
initial marking $M_0 = \{p_1, \ldots, p_n\}$ is inductively defined as follows:

— $(\{(p_1, \bot), \ldots, (p_n, \bot)\}, \emptyset)$ is a branching process of $\Sigma$.³

— If $(B, E)$ is a branching process of $\Sigma$, $t \in T$, and $X \subseteq B$ is a co-set labelled
by ${}^\bullet t$, then $(B \cup \{(p, e) \mid p \in t^\bullet\}, E \cup \{e\})$ is also a branching process of $\Sigma$,
where $e = (t, X)$. If $e \notin E$, then $e$ is called a possible extension of $(B, E)$.
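
The inductive definition can be turned almost literally into code. The sketch below is an illustration only: the net `NET`, the marking `M0`, and all function names are hypothetical, conditions are encoded as pairs `(p, e)` and events as pairs `(t, X)` as in the definition, and `None` stands for the special symbol ⊥. Possible extensions are found by choosing one condition per input place of a transition and keeping the choice if the conditions are pairwise concurrent.

```python
from itertools import product

BOTTOM = None  # stands for the special symbol ⊥

# Hypothetical 1-safe net: transition -> (preset, postset).
NET = {
    "t1": ({"p1"}, {"p3"}),
    "t2": ({"p1"}, {"p4"}),
    "t3": ({"p2"}, {"p5"}),
    "t4": ({"p3", "p5"}, {"p1"}),
    "t5": ({"p3", "p4"}, {"p2"}),
}
M0 = {"p1", "p2"}


def past(node):
    """Events e with e <= node; events are (t, frozenset), conditions (p, event)."""
    if isinstance(node[1], frozenset):  # node is an event (t, X)
        result = {node}
        for cond in node[1]:
            result |= past(cond)
        return result
    producer = node[1]  # node is a condition (p, e)
    return set() if producer is BOTTOM else past(producer)


def concurrent(c1, c2):
    """c1 co c2 for two distinct conditions: neither causal nor in conflict."""
    causal = any(c1 in e[1] for e in past(c2)) or any(c2 in e[1] for e in past(c1))
    conflict = any(e1 != e2 and e1[1] & e2[1] for e1 in past(c1) for e2 in past(c2))
    return not causal and not conflict


def possible_extensions(conditions, events):
    """Events (t, X) not yet in `events` whose preset X is a co-set labelled by •t."""
    new = set()
    for t, (pre, _) in NET.items():
        choices = [[c for c in conditions if c[0] == p] for p in pre]
        for combo in product(*choices):
            if all(concurrent(a, b) for i, a in enumerate(combo) for b in combo[i + 1:]):
                e = (t, frozenset(combo))
                if e not in events:
                    new.add(e)
    return new


def unfold(max_events):
    """Add possible extensions one at a time, smallest local configuration first."""
    conditions, events = {(p, BOTTOM) for p in M0}, set()
    while len(events) < max_events:
        ext = possible_extensions(conditions, events)
        if not ext:
            break
        e = min(ext, key=lambda ev: (len(past(ev)), ev[0]))
        conditions |= {(p, e) for p in NET[e[0]][1]}
        events.add(e)
    return conditions, events
```

For this hypothetical net the unfolding happens to be finite (six events), and `t5` never occurs because its two input places are only ever marked by conflicting or causally related conditions. In general the unfolding is infinite, which is why `unfold` takes an explicit bound on the number of events.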

The set of branching processes of $\Sigma$ is obtained by declaring that the union
of any finite or infinite set of branching processes is also a branching process,
where union of branching processes is defined componentwise on conditions and
events. Since branching processes are closed under union, there is a unique
maximal branching process, called the unfolding of $\Sigma$. The unfolding of our running
example is an infinite occurrence net. Figure 2 shows an initial part. Events and
conditions have been assigned identifiers that will be used in the examples.
For instance, the event $(t_1, \{(p_1, \bot)\})$ is assigned the identifier 1.




Fig. 2. The unfolding of $\Sigma$



³ This is the point at which we use the fact that the initial marking is 1-safe.

Configurations. A configuration of an occurrence net is a set of events $C$
satisfying the two following properties: $C$ is causally closed, i.e., if $e \in C$ and $e' < e$
then $e' \in C$, and $C$ is conflict-free, i.e., no two events of $C$ are in conflict. Given
an event $e$, we call $[e] = \{e' \in E \mid e' \leq e\}$ the local configuration of $e$. Let
$\mathit{Min}$ denote the set of minimal places of the branching process. A configuration
$C$ of the branching process is associated with a marking of $\Sigma$ denoted by
$\mathit{Mark}(C) = l((\mathit{Min} \cup C^\bullet) \setminus {}^\bullet C)$.

In Fig. 2, $\{1, 3, 4, 6\}$ is a configuration, and $\{1, 4\}$ (not causally closed) or
$\{1, 2\}$ (not conflict-free) are not. A set of events is a configuration if and only
if there is one or more firing sequences of the occurrence net (from the default
initial marking) containing each event from the set exactly once, and no
further events. These firing sequences are called linearisations. The configuration
$\{1, 3, 4, 6\}$ has two linearisations, namely 1 3 4 6 and 3 1 4 6. All linearisations lead
to the same reachable marking. For example, the two sequences above lead to the
marking $\{p_1, p_7\}$. By applying the labelling function to a linearisation we obtain
a firing sequence of $\Sigma$. By abuse of language, we also call this firing sequence a
linearisation. In our example we obtain $t_1 t_3 t_4 t_6$ and $t_3 t_1 t_4 t_6$ as linearisations.
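
The notions of configuration, $\mathit{Mark}(C)$, and linearisation can be made concrete on a tiny explicit occurrence net. The sketch below is illustrative only: the four events, the condition names `b1` to `b6`, and all function names are hypothetical and do not reproduce Fig. 2.

```python
from itertools import permutations

# Hypothetical occurrence net: event -> (label, input conditions, output conditions).
EVENTS = {
    1: ("t1", {"b1"}, {"b3"}),
    2: ("t2", {"b1"}, {"b4"}),  # in conflict with event 1 (shared input b1)
    3: ("t3", {"b2"}, {"b5"}),
    4: ("t4", {"b3", "b5"}, {"b6"}),
}
LABEL = {"b1": "p1", "b2": "p2", "b3": "p3", "b4": "p4", "b5": "p5", "b6": "p6"}
MIN = {"b1", "b2"}  # minimal conditions


def local_configuration(e):
    """[e]: the event e together with all its causal predecessors."""
    result = {e}
    for cond in EVENTS[e][1]:
        for e2, (_, _, post) in EVENTS.items():
            if cond in post:
                result |= local_configuration(e2)
    return result


def is_configuration(events):
    """Causally closed and conflict-free."""
    closed = all(local_configuration(e) <= set(events) for e in events)
    # In a causally closed set, a conflict shows up as two events sharing an input.
    conflict_free = all(
        not (EVENTS[a][1] & EVENTS[b][1]) for a in events for b in events if a != b
    )
    return closed and conflict_free


def mark(config):
    """Mark(C) = l((Min ∪ C•) \\ •C)."""
    produced = {c for e in config for c in EVENTS[e][2]}
    consumed = {c for e in config for c in EVENTS[e][1]}
    return {LABEL[c] for c in (MIN | produced) - consumed}


def linearisations(config):
    """All orderings of the events that respect the causal order."""
    result = []
    for order in permutations(sorted(config)):
        fired = set()
        for e in order:
            if not (local_configuration(e) - {e}) <= fired:
                break
            fired.add(e)
        else:
            result.append(order)
    return result
```

In this hypothetical net `{1, 3, 4}` is a configuration with the two linearisations `(1, 3, 4)` and `(3, 1, 4)`, both leading to the marking `{"p6"}`, while `{1, 4}` fails causal closure and `{1, 2}` fails conflict-freeness.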

Given a configuration $C$, we denote by $\uparrow C$ the set of events $e \in E$ such that:
(1) $e' < e$ for some event $e' \in C$, and (2) $e$ is not in conflict with any event of
$C$. Intuitively, $\uparrow C$ corresponds to the behavior of $\Sigma$ from the marking reached
after executing any of the linearisations of $C$. We call $\uparrow C$ the continuation after
$C$ of the unfolding of $\Sigma$. If $C_1$ and $C_2$ are two finite configurations leading to
the same marking, i.e. $\mathit{Mark}(C_1) = M = \mathit{Mark}(C_2)$, then $\uparrow C_1$ and $\uparrow C_2$ are
isomorphic, i.e., there is a bijection between them which preserves the labelling
of events and the causal, conflict, and concurrency relations (see [?]).

4 A Tableau System for the Illegal ω-Trace Problem

In this section we present an unfolding technique for detecting illegal $\omega$-traces.
We introduce it using the terminology of tableau systems, the reason being that
the technique has many similarities with tableau systems as used for instance
in [?] for model-checking LTL, or in [?] for model-checking the mu-calculus.
However, no previous knowledge of tableau systems is required.

Adequate orders. We need the notion of adequate order on configurations [?]. In
fact, our tableau system will be parametric in the adequate order, i.e., we will
obtain a different system for each adequate order. Given a configuration $C$ of the
unfolding of $\Sigma$, we denote by $C \oplus E$ the set $C \cup E$, under the condition that $C \cup E$
is a configuration satisfying $C \cap E = \emptyset$. We say that $C \oplus E$ is an extension of $C$.
Now, let $C_1$ and $C_2$ be two finite configurations leading to the same marking.
Then $\uparrow C_1$ and $\uparrow C_2$ are isomorphic. This isomorphism, say $f$, induces a mapping
from the extensions of $C_1$ onto the extensions of $C_2$; the image of $C_1 \oplus E$ under
this mapping is $C_2 \oplus f(E)$.

Definition 2. A partial order $\prec$ on the finite configurations of the unfolding of
a net system is an adequate order if:

— $\prec$ is well-founded,

— $C_1 \subset C_2$ implies $C_1 \prec C_2$, and

— $\prec$ is preserved by finite extensions; if $C_1 \prec C_2$ and $\mathit{Mark}(C_1) = \mathit{Mark}(C_2)$,
then the isomorphism $f$ from above satisfies $C_1 \oplus E \prec C_2 \oplus f(E)$ for all
finite extensions $C_1 \oplus E$ of $C_1$.

Total adequate orders are particularly good for our tableau systems because 
they lead to stronger conditions for an event to be a terminal, and so to smaller 
tableaux. Total adequate orders for 1-safe Petri nets and for synchronous
products of transition systems have been presented in [?].



4.1 The Tableau System 

Given a configuration $C$ of the unfolding of $\Sigma$, denote by $\#_I C$ the number of
events $e \in C$ labelled by transitions of $I$.

Definition 3. An event $e$ of a branching process $BP$ is a repeat (with respect
to $\prec$) if $BP$ contains another event $e'$, called the companion of $e$, such that
$\mathit{Mark}([e']) = \mathit{Mark}([e])$, and either

(I) $e' < e$, or

(II) $\neg(e' < e)$, $[e'] \prec [e]$, and $\#_I[e'] \geq \#_I[e]$.

A terminal is a minimal repeat with respect to the causal relation; in other
words, a repeat $e$ is a terminal if the unfolding of $\Sigma$ contains no repeat $e' < e$.
Repeats, and in particular terminals, are of type I or type II, according to the
condition they satisfy.

Events labelled by $I$-transitions are called $I$-events. A repeat $e$ with companion
$e'$ is successful if it is of type I, and $[e] \setminus [e']$ contains some $I$-event. Otherwise
it is unsuccessful.

A tableau is a branching process BP such that for every possible extension 
e of BP at least one of the immediate causal predecessors of e is a terminal. A 
tableau is successful if at least one of its terminals is successful. 

Loosely speaking, a tableau is a branching process which cannot be extended
without adding a causal successor to a terminal. In the case of a terminal of type
I, $\uparrow[e]$ need not be constructed because $\uparrow[e']$, which is isomorphic to it, will be
in the tableau. In the case of a terminal of type II, $\uparrow[e]$ need not be constructed
either, because $\uparrow[e']$ will appear in the tableau. However, in order to guarantee
completeness, we need the condition $\#_I[e'] \geq \#_I[e]$.
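
The case distinction of Definition 3 can be sketched as a small classification function. This is an illustration under assumptions, not the construction used in the paper: local configurations and markings are supplied as precomputed dictionaries, all names and data are hypothetical, and the order $\prec$ is instantiated with comparison by size, $|[e']| < |[e]|$, which is an adequate order but not a total one.

```python
def classify_repeat(e, comp, local, mark, i_events):
    """Classify event `e` against a candidate companion `comp` (Definition 3).

    `local[x]` is the local configuration [x] as a set of events, `mark[x]` is
    Mark([x]), and `i_events` is the set of I-events. Returns None if `e` is not
    a repeat with this companion, otherwise a pair (type, successful).
    """
    if e == comp or mark[e] != mark[comp]:
        return None
    conf_e, conf_c = local[e], local[comp]
    if comp in conf_e:  # condition (I): e' < e
        return ("I", bool((conf_e - conf_c) & i_events))
    smaller = len(conf_c) < len(conf_e)  # [e'] ≺ [e] in the size order (assumption)
    if smaller and len(conf_c & i_events) >= len(conf_e & i_events):
        return ("II", False)  # condition (II); type II repeats are never successful
    return None


# Hypothetical data: a < b < c, d is minimal, f < e.
LOCAL = {
    "a": {"a"}, "b": {"a", "b"}, "c": {"a", "b", "c"},
    "d": {"d"}, "e": {"f", "e"}, "f": {"f"},
}
MARK = {
    "a": frozenset({"p2"}), "b": frozenset({"p1"}), "c": frozenset({"p2"}),
    "d": frozenset({"p3"}), "e": frozenset({"p3"}), "f": frozenset({"p4"}),
}
```

With this data, `c` is a successful type I repeat with companion `a` when `b` is an I-event, and `e` is a type II repeat with companion `d` as long as `[e]` contains no more I-events than `[d]`. If `f` is declared an I-event, the condition on $\#_I$ fails and `e` is not a repeat with companion `d`.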

The tableau construction is straightforward. Given $\Sigma = (N, M_0)$, where
$M_0 = \{p_1, \ldots, p_n\}$, start from the branching process $(\{(p_1, \bot), \ldots, (p_n, \bot)\}, \emptyset)$.
Add events according to the inductive definition of branching process, but with
the restriction that no event having a terminal as a causal predecessor is added.
Events are added in $\prec$ order; more precisely, if $[e] \prec [e']$, then $e$ is added before
$e'$. The construction terminates when no further events can be added.

We construct the tableau corresponding to the net system of Fig. 1 using the
total adequate order of [?].⁴ All we need to know about this order is that for
the events 4 and 5 in Fig. 2, $[4] \prec [5]$ holds. The tableau is the fragment of the
unfolding of Fig. 2 having events 16, 17, and 5 as terminals. Events 16 and 17 are
terminals of type I having event 4 as companion. Event 16 is successful because
the set $[16] \setminus [4] = \{6, 7, 10, 11, 12, 16\}$ contains an $I$-event, namely 10. The
intuition behind these terminals is rather clear: a terminal of type I corresponds
to a cycle in the reachability graph. Loosely speaking, the events of $[16] \setminus [4]$
correspond to a firing sequence leading from $\mathit{Mark}([4])$ to $\mathit{Mark}([16])$, and these
two markings coincide. Since $[16] \setminus [4]$ contains an $I$-event, the firing sequence
contains a transition of $I$, and so we have found an illegal $\omega$-trace. The set $[17] \setminus [4]$
doesn’t contain any $I$-event, but $\uparrow[17]$ need not be constructed, because it is
isomorphic to $\uparrow[4]$. Event 5 is a terminal of type II with event 4 as companion
because $\mathit{Mark}([4]) = \{p_6, p_7\} = \mathit{Mark}([5])$, $[4] \prec [5]$, and $1 = \#_I[4] \geq \#_I[5] = 0$.
The intuition is that $\uparrow[5]$ need not be constructed, because it is isomorphic to
$\uparrow[4]$. However, this doesn’t explain why the condition $\#_I[e'] \geq \#_I[e]$ is needed.
In [?] we present an example showing that after removing this condition the
tableau system is no longer complete.

Let $K$ denote the number of reachable markings of $\Sigma$, and let $B$ denote the
maximum number of tokens that the reachable markings of $\Sigma$ put in all the
places of $\Sigma$. We have the following result:

Theorem 2. Let $\mathcal{T}$ be a tableau of $\Sigma$ constructed according to a total adequate
order $\prec$.

— $\mathcal{T}$ is successful if and only if $\Sigma$ has an illegal $\omega$-trace.

— $\mathcal{T}$ contains at most $K^2 \cdot B$ non-terminal events.

— If the transitions of $I$ are pairwise non-concurrent, then $\mathcal{T}$ contains at most
$K^2$ non-terminal events.



5 A Tableau System for the Illegal Livelock Problem 

The tableau system for the illegal livelock problem is a bit more involved than
that of the illegal $\omega$-trace problem. In a first step we compute a set $CP =
\{M_1, \ldots, M_n\}$ of reachable markings of $\Sigma$, called the set of checkpoints. This set
has the following property: if $\Sigma$ has an illegal livelock, then it also has an illegal
livelock $M_0 \xrightarrow{t_1 t_2 \ldots t_i} M \xrightarrow{t_{i+1} t_{i+2} \ldots}$ such that $t_i \in L$ and $M$ is a checkpoint.
For the computation of $CP$ we use the unfolding technique of [?] or [?]; the
procedure is described in Sect. 5.1.

The tableau system solves the problem whether some checkpoint enables an
infinite sequence of invisible actions. Clearly, $\Sigma$ has an illegal livelock if and
only if this is indeed the case. For this, we consider the net $N_{inv}$ obtained from
$N$ by removing all the visible transitions together with their adjacent arcs. We
construct unfoldings for the net systems $(N_{inv}, M_1), \ldots, (N_{inv}, M_n)$, and check
on them if the systems exhibit some infinite behavior. The tableau system is
described in Sect. 5.2.

⁴ We can also take the order of [?], which for this example yields the same results.
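
The two-phase idea (first checkpoints, then infinite invisible behaviour from a checkpoint) can be mirrored on the explicit reachability graph of a small net. The sketch below is an explicit-state analogue under assumptions, not the unfolding-based procedure of this section: the net and all names are hypothetical, and the checkpoint set is over-approximated by every marking reached immediately after firing a transition of L.

```python
from collections import deque

# Hypothetical 1-safe net: transition -> (preset, postset).
NET = {
    "a": ({"p1"}, {"p2"}),
    "b": ({"p2"}, {"p1"}),
    "c": ({"p1"}, {"p3"}),
    "d": ({"p3"}, {"p3"}),
}
M0 = frozenset({"p1"})
VISIBLE = {"a", "b"}
L_SET = {"c"}


def explore(net, start):
    """Reachability graph of (net, start) as (markings, edges)."""
    seen, edges, queue = {start}, [], deque([start])
    while queue:
        m = queue.popleft()
        for t, (pre, post) in net.items():
            if pre <= m:
                m2 = frozenset((m - pre) | post)
                edges.append((m, t, m2))
                if m2 not in seen:
                    seen.add(m2)
                    queue.append(m2)
    return seen, edges


def checkpoints(net, m0, l_set):
    """Phase 1: markings reached right after firing a transition of L."""
    return {m2 for _, t, m2 in explore(net, m0)[1] if t in l_set}


def invisible_net(net, visible):
    """N_inv: the net without its visible transitions and their arcs."""
    return {t: arcs for t, arcs in net.items() if t not in visible}


def has_infinite_behaviour(net, start):
    """Phase 2: prune markings with no surviving successor until a fixpoint."""
    markings, edges = explore(net, start)
    succ = {m: {m2 for a, _, m2 in edges if a == m} for m in markings}
    alive = set(markings)
    while True:
        dead = {m for m in alive if not (succ[m] & alive)}
        if not dead:
            return start in alive  # start survives iff it has an infinite run
        alive -= dead


def has_illegal_livelock(net, m0, visible, l_set):
    inv = invisible_net(net, visible)
    return any(has_infinite_behaviour(inv, m) for m in checkpoints(net, m0, l_set))
```

In the hypothetical net, firing `c` leads to the marking {p3}, from which the invisible transition `d` can fire forever, so an illegal livelock exists. Removing `d` from the net leaves the same checkpoint but no infinite invisible run.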

5.1 Computing the Set of Checkpoints 

We construct the complete prefix of the unfolding of $\Sigma$ as defined in [?] or [?].
In the terminology of this paper, the complete prefix corresponds to a tableau
in which an event $e$ is a terminal if there is an event $e'$ such that $\mathit{Mark}([e']) =
\mathit{Mark}([e])$, and $[e'] \prec [e]$.

Definition 4. A marking $M$ belongs to the set $CP$ of checkpoints of $\Sigma$ if $M =
\mathit{Mark}([e])$ for some non-terminal event $e$ of the complete prefix of $\Sigma$ labelled by
a transition of $L$.

Let us compute $CP$ for our example. The complete prefix of $\Sigma$ coincides
with the tableau for the illegal $\omega$-trace problem. The events labelled by $t_2$, the
only transition of $L$, are 2 and 11. The corresponding markings are $\mathit{Mark}([2]) =
\{p_2, p_4\}$ and $\mathit{Mark}([11]) = \{p_4, p_7\}$. So $CP = \{ \{p_2, p_4\}, \{p_4, p_7\} \}$.

5.2 The Tableau System 

Let $\{M_1, \ldots, M_n\}$ be the set of checkpoints obtained in the first phase. We will
use $\Sigma_1, \ldots, \Sigma_n$ to denote the net systems $(N_{inv}, M_1), \ldots, (N_{inv}, M_n)$.

Definition 5. Let $BP_1, \ldots, BP_n$ be branching processes of $\Sigma_1, \ldots, \Sigma_n$,
respectively. An event $e$ of $BP_i$ is a repeat (with respect to $\prec$) if there is an index
$j \leq i$ and an event $e'$ in $BP_j$, called the companion of $e$, such that $\mathit{Mark}([e']) =
\mathit{Mark}([e])$, and either

(I) $j < i$, or

(II) $i = j$ and $e' < e$, or

(III) $i = j$, $\neg(e' < e)$, $[e'] \prec [e]$, and $|[e']| \geq |[e]|$.

A repeat $e$ of $BP_i$ is a terminal if $BP_i$ contains no repeat $e' < e$. Repeats,
and in particular terminals, are of type I, II, or III, according to the condition
they satisfy. A repeat $e$ with companion $e'$ is successful if it is of type II, and
unsuccessful otherwise.

A tableau is a tuple $BP_1, \ldots, BP_n$ of branching processes of $\Sigma_1, \ldots, \Sigma_n$ such
that for every $1 \leq i \leq n$ and for every possible extension $e$ of $BP_i$ at least one
of the immediate causal predecessors of $e$ is a terminal. Each $BP_i$ is called a
tableau component. A tableau is successful if at least one of its terminals is
successful.

Observe that an event of $BP_i$ can be a repeat because of an event that
belongs to another branching process $BP_j$. The definition of repeat depends on
the order of the checkpoints, but the tableau system defined above is sound and
complete for any fixed order. Because the definition of the tableau component
$BP_i$ depends only on the components with a smaller index, we can create the
tableau components in increasing $i$ order. Tableau components are constructed
as for the illegal $\omega$-trace problem, using the new definition of terminal.

Fig. 3. The tableau system for the illegal livelock problem

The tableau for our example is shown in Fig. 3. The names of places and
transitions have been chosen to match “pieces” of the unfolding in Fig. 2. The first
tableau component contains no terminals; the construction terminates because
no event labelled by an invisible transition can be added. In the second
component, event 12 is a terminal with event 3 in the first component as companion.
The intuition is that we don’t need to unfold beyond 12 in the second component,
because what we construct can be found after 3 in the first component.

Similarly to the case of the illegal $\omega$-trace problem, a terminal of type II
corresponds to a cycle in the reachability graph. Since the transitions of $N_{inv}$
are all invisible, such a cycle always originates an illegal livelock, and so terminals
of type II are always successful. For terminals of type III, the intuition is that
$\uparrow[e]$ need not be constructed, because it is isomorphic to $\uparrow[e']$. The condition
$|[e']| \geq |[e]|$ is required for completeness (see [?]). We have the following result:

Theorem 3. Let $\mathcal{T}_1, \ldots, \mathcal{T}_n$ be a tableau of $\Sigma_1, \ldots, \Sigma_n$ constructed according to
a total adequate order $\prec$.

— $\mathcal{T}_1, \ldots, \mathcal{T}_n$ is successful if and only if $\Sigma$ contains an illegal livelock.

— $\mathcal{T}_1, \ldots, \mathcal{T}_n$ contain together at most $K^2 \cdot B$ non-terminal events.



5.3 A Tableau System for the 1-Safe Case 

If $\Sigma$ is 1-safe then we can modify the tableau system to obtain a bound of $K^2$
non-terminal events. We modify the definition of the repeats of type II and III:

(II') $i = j$ and $\neg(e' \# e)$, or

(III') $i = j$, $e' \# e$, $[e'] \prec [e]$, and $|[e']| \geq |[e]|$.

Theorem 4. Let $\Sigma$ be 1-safe. Let $\mathcal{T}_1, \ldots, \mathcal{T}_n$ be a tableau of $\Sigma_1, \ldots, \Sigma_n$
constructed according to a total adequate order $\prec$ and to the new definition of
repeats of type II and III.

— $\mathcal{T}_1, \ldots, \mathcal{T}_n$ is successful if and only if $\Sigma$ contains an illegal livelock.

— $\mathcal{T}_1, \ldots, \mathcal{T}_n$ contain together at most $K^2$ non-terminal events.

6 A Tableau System for LTL Model Checking 

Putting the tableau systems of Sections 4 and 5 together, we obtain a tableau
system for the model checking problem of tLTL'. For the sake of clarity we have
considered the illegal $\omega$-trace problem and the illegal livelock problem separately.
However, when implementing the tableau systems there is no reason to do so.
Since all the branching processes we need to construct are “embedded” in the
unfolding of $\Sigma_{\neg\varphi}$, it suffices in fact to construct one single branching process,
namely the union of all the processes needed to solve both problems.

Clearly, this prefix contains $O(K^2 \cdot B)$ non-terminal events. If the system
is presented as a 1-safe Petri net, then the prefix contains $O(K^2)$ non-terminal
events because the following two conditions hold: (i) none of the reachable
markings of the synchronization $\Sigma_{\neg\varphi}$ enables two $I$-transitions concurrently; (ii) if the
system is a 1-safe Petri net, then the synchronization is also 1-safe.

7 Conclusions 

We have presented a new unfolding technique for checking LTL-properties. We
first make use of the automata-theoretic approach to model checking: a combined
system is constructed as the product of the system itself and of an automaton
for the negation of the property to be checked. The model checking problem
reduces to the illegal $\omega$-trace problem and to the illegal livelock problem for the
combined system. Both problems are solved by constructing certain prefixes of
the net unfolding of the combined system. In fact, it suffices to construct the
union of these prefixes.

The prefixes can be seen as tableau systems for the illegal $\omega$-trace and the
illegal livelock problem. We have proved soundness and completeness of these
tableau systems, and we have given an upper bound on the size of the tableau.
For systems presented as 1-safe Petri nets or products of automata, tableaux
contain at most $O(K^2)$ (non-terminal) events, where $K$ is the number of
reachable states of the system. An interesting open problem is the existence of
a better tableau system such that tableaux contain at most $O(K)$ events. We
conjecture that it doesn’t exist.

The main advantage of our approach is its simplicity. Wallner’s approach pro- 
ceeds in two steps: construction of a complete prefix, and then construction of a 
graph. The definition of a graph is non-trivial, and the graph itself can be expo- 
nential in the size of the complete prefix. Our approach makes the construction 
of the graph unnecessary. The price to pay is a larger prefix. 

Abstract. We consider the problem of reasoning about message based
systems in finite state environments. Two notions of finite state
environments are discussed: bounded buffers and implicit buffers. The former
notion is standard, whereby the sender gets blocked when the buffer is
full. In the latter, the sender proceeds as if the buffer were unbounded,
but the system has bounded memory and hence “forgets” some of the
messages. The computations of such systems are given as communication
diagrams. We present a linear time temporal logic which is interpreted on
n-agent diagrams. The formulas of the logic specify local properties using
standard temporal modalities and a basic communication modality. The
satisfiability and model checking problems for the logic are shown to be
decidable for both buffered products and implicit products. An example
of system specification in the logic is discussed.



1 Motivation 

In distributed systems, the computations of autonomous agents proceed asyn- 
chronously and communication is by message passing. When the medium of 
communication buffers messages, the sender is assumed to put messages into the 
medium and proceed in its computation whereas a receiver may need to wait for 
message delivery. This notion that the sender is not blocked is often referred to 
as the unbounded buffer abstraction (msni), and is widely used in the design of
agents in distributed systems. In addition, assumptions are made about whether
message delivery is guaranteed or not, about whether messages are delivered in
the order in which they were sent, and so on. Most distributed algorithms work
on variants of such models ([Lyn96]).

While the assumption that buffers are unbounded abstracts away messy de- 
tails and hence is welcome, implementations need to work with bounded buffers. 
But then, when such a buffer is full, the sender must wait till one of the el- 
ements has been removed. We need to modify the original design, and since 
each change may cascade a number of changes in the pattern of communication 

* We thank Kamal Lodaya and the reviewers for helpful discussions, and Laura Semini 
for bringing to our attention mm on which Section 0is based. 
between agents, we may end up with deadlock or other disastrous possibilities,
necessitating verification again. Therefore, we would like to have a methodology 
whereby systems are developed without reference to buffers, and system prop- 
erties are stated without mentioning buffer status, but buffers figure only in the 
analysis. In some sense, buffer behaviour should be abstracted as environment 
behaviour, where the environment is (hopefully) finite state. 

When both the agents in the system and the environment are finite state, 
automata-theoretic techniques can be employed in the verification of temporal 
properties fsm- In this paper, we follow such a framework in the context of 
message passing in finite state environments. We consider systems of finite state 
automata with communication constraints between them. The computations of 
these systems are given as communication diagrams, a slight generalization of 
Lamport diagrams and message sequence charts (MSCs). We study two notions
of finite state environments: product of automata defined by bounded buffers
and that defined by implicit buffers. The former notion is standard, whereby the 
sender gets blocked when the buffer is full. In the latter, the sender proceeds as 
if the buffer were unbounded, but the system has bounded memory and hence 
“forgets” some of the messages, or the order in which messages were sent. In our 
automaton model, communications are not observable system transitions, but 
only define a coupling relation that constrains the possible runs of the system; 
this is a departure from the way such systems are typically modelled in process 
algebras or net theory. 

The diagrams suggest a natural linear time temporal logic in which the for- 
mulas specify local properties using standard temporal modalities and a basic 
communication modality. The satisfiability and model checking problems for the 
logic are shown to be decidable for both buffered products and implicit products. 
We illustrate how this logic can be employed in reasoning about message based 
systems using a conference coordination example suggested by IMb99l . ICiN'i'98l . 
This system consists of authors who submit papers, reviewers who examine the 
papers and a moderator who communicates the results. 

A distinct feature of the proposed logic is that it uses local assertions as in 
. In the context of bounded buffer products, this means that the logic can- 
not force buffers of specific size. Most of the locally based logics in the literature 
study systems where communication is only by synchronous means, or, as is the 
case with [LRT92], study subclasses of event structures, where the model check-
ing problem is not addressed. The work of [HNW99] does consider the model
checking problem using finite unfoldings of a class of nets, but since they do not 
study linearizations of the associated partial order, these difficulties related to 
buffers do not arise there. 

[AY99] also present results about model checking message passing systems
where the computations are MSCs. The difference is that our approach is dual:
as [AY99] put it neatly, the communicating state machine model (that we study)
is a parallel composition of sequential machines, while the MSC model (that they
discuss) is a sequential composition of concurrent executions.
2 Communicating Automata

Throughout the paper, we fix $n > 0$, and study distributed systems of $n$ agents.
We will follow the linguistic convention that the term ‘system’ will refer to
composite distributed systems, and ‘agent’ to component processes in systems.
In any system, the agents are referred to by the indices $i$, $1 \le i \le n$. We will use
the notation $[n]$ to denote the set $\{1, 2, \ldots, n\}$.

A distributed alphabet for such systems is an $n$-tuple $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_n)$,
where for each $i \in [n]$, $\Sigma_i$ is a finite nonempty alphabet of $i$-actions and for all
$i \ne j$, $\Sigma_i \cap \Sigma_j = \emptyset$. The set of system actions is the set $\Sigma' = \{\lambda\} \cup \Sigma$, where
$\Sigma = \bigcup_i \Sigma_i$. $\lambda$ is referred to as the communication action. We will use $a, b, c$ etc.
to refer to elements of $\Sigma$ and $\tau, \tau'$ etc. to refer to those of $\Sigma'$.

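Definition 2.1 A System of $n$ communicating automata (SCA) on $\widetilde{\Sigma}$ is
a tuple $S = ((Q_1, G_1), \ldots, (Q_n, G_n), \rightarrow, Init)$, where for $j \in [n]$, $Q_j$ is a finite
set of states, $G_j \subseteq Q_j$, $Init \subseteq (Q_1 \times \ldots \times Q_n)$ and for $i \ne j$, $Q_i \cap Q_j = \emptyset$. Let
$Q = \bigcup_i Q_i$. $\rightarrow \subseteq (Q \times \Sigma' \times Q)$ such that if $q \xrightarrow{\tau} q'$ then either there exists $i$ such
that $\{q, q'\} \subseteq Q_i$ and $\tau \in \Sigma_i$, or there exist $i \ne j$ such that $q \in Q_i$, $q' \in Q_j$ and
$\tau = \lambda$.

$Init$ contains the (global) initial states of the system and $G_j$ are the (local)
good states of agent $j$. We will use the notation ${}^{\bullet}q = \{q' \mid q' \xrightarrow{\lambda} q\}$ and
$q^{\bullet} = \{q' \mid q \xrightarrow{\lambda} q'\}$. The set $\widetilde{Q} = (Q_1 \times \ldots \times Q_n)$ refers to global states, and
when $u = (q_1, \ldots, q_n) \in \widetilde{Q}$, $u[i]$ refers to $q_i$.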

Note that $\rightarrow$ above is not a global transition relation; it consists of local
transition relations, one for each automaton, and communication constraints of the
form $q \xrightarrow{\lambda} q'$, where $q$ and $q'$ are states of different automata. The latter define a
coupling relation rather than a transition. The interpretation of local transition
relations is standard: when the $i$-th automaton is in state $q_1$ and reads input
$a \in \Sigma_i$, it can move to a state $q_2$ and be ready for the next input if $(q_1, a, q_2) \in \rightarrow$.

The interpretation of the communication constraint is non-standard and depends
only on automaton states, not on input. When $q \xrightarrow{\lambda} q'$, where $q \in Q_i$ and $q' \in Q_j$,
it constrains the system behaviour as follows: whenever automaton $i$ is in state
$q$, it puts out a message whose content is $q$ and intended recipient is $j$; whenever
automaton $j$ needs to enter state $q'$, it checks its environment to see if a message
of the form $q$ from $i$ is available for it, and waits indefinitely otherwise. If a
system $S$ has no $\lambda$ constraints at all, the automata in it proceed asynchronously
and do not wait for each other.
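
To make Definition 2.1 concrete, the following is a minimal Python sketch of an SCA as a data structure, together with a check of the edge condition. The class name `SCA`, the helper names `owner` and `well_formed`, the label `LAMBDA`, and the 0-based agent indices are assumptions made for illustration; they are not notation from the paper.

```python
from dataclasses import dataclass

LAMBDA = "lambda"  # hypothetical label standing for the communication action


@dataclass(frozen=True)
class SCA:
    states: tuple      # states[i] is the frozenset Q_i of agent i (0-based)
    alphabets: tuple   # alphabets[i] is the frozenset Sigma_i of agent i
    edges: frozenset   # triples (q, tau, q2) making up the relation ->


def owner(sca, q):
    """Index of the unique agent whose state set contains q."""
    return next(i for i, qs in enumerate(sca.states) if q in qs)


def well_formed(sca):
    """Check the edge condition of Definition 2.1 on every triple."""
    for q, tau, q2 in sca.edges:
        i, j = owner(sca, q), owner(sca, q2)
        if tau == LAMBDA:
            if i == j:
                return False  # a communication constraint links two agents
        elif i != j or tau not in sca.alphabets[i]:
            return False      # a local move stays inside one agent
    return True


# Two agents; entering t1 requires a message deposited from state s0.
example = SCA(
    states=(frozenset({"s0", "s1"}), frozenset({"t0", "t1"})),
    alphabets=(frozenset({"a"}), frozenset({"b"})),
    edges=frozenset({("s0", "a", "s1"), ("t0", "b", "t1"), ("s0", LAMBDA, "t1")}),
)
```

Keeping local moves and $\lambda$ constraints in one edge set mirrors the single relation $\rightarrow$ of the definition; a practical tool would probably index them separately.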

However, computations where every receipt of a message is matched with a
corresponding ‘send’ lead to non-regular behaviour. We therefore constrain the
interpretation of $\lambda$ constraints to reflect finite environments. The first one we
consider is that of bounded buffers. When $q \xrightarrow{\lambda} q'$, $q \in Q_i$, $q' \in Q_j$, and $i$ needs to
enter state $q$, it checks whether the bounded buffer constituting the environment
is full; if it is, then it waits till it is no longer full, then puts in a message $q$ for $j$
and proceeds. The product construction defined below is designed to reflect this
intuition.



2.1 Bounded Buffers 

To keep the presentation simple, we will focus only on 1-buffered runs. (In the next
section, we will see that unit buffers suffice for the logic presented, justifying this
decision.) The product definition below can be generalized for $k$-buffered runs,
for any $k > 0$, but becomes technically more involved.

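Definition 2.2 Given an SCA $S = ((Q_1, G_1), \ldots, (Q_n, G_n), \rightarrow, Init)$ on $\widetilde{\Sigma}$, the
1-product of the system is defined to be $Pr_S = (X, I, G, \Rightarrow)$, where $X = \widetilde{Q} \times \mathcal{B}$,
$I = (Init \times \{\emptyset\})$, $G = (G_1, \ldots, G_n)$ and
$\mathcal{B} = \{B \subseteq ([n] \times Q) \mid \text{if } (i, q) \in B \text{ then } q \notin Q_i, \text{ and for each } i, j, \text{ there exists at most one } (i, q) \in B \text{ where } q \in Q_j\}$.
$\Rightarrow \subseteq (X \times \Sigma \times X)$ is defined by: $(q_1, \ldots, q_n, B) \stackrel{a}{\Rightarrow} (q'_1, \ldots, q'_n, B')$, $a \in \Sigma_i$, iff

1. $q_i \xrightarrow{a} q'_i$, and for all $j \ne i$, $q_j = q'_j$.

2. if $({}^{\bullet}q'_i \cap Q_j) = R \ne \emptyset$, then there exists $q \in R$ such that $(i, q) \in B$ and
$B' = B - \{(i, q)\}$.

3. if $(q_i^{\bullet} \cap Q_j) \ne \emptyset$, then $\{(j, q) \in B \mid q \in Q_i\} = \emptyset$ and $B' = B \cup \{(j, q_i)\}$.

A state $q \in Q_i$ is said to be terminal if $\{q' \in Q_i \mid q \xrightarrow{a} q' \text{ for some } a \in \Sigma_i\} = \emptyset$.
For $w \in \Sigma^{\omega}$, we use the notation $w \lceil i$ to denote the restriction of $w$ to $\Sigma_i$.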
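
The three clauses of Definition 2.2 can be read as a successor function on product states. The sketch below is one possible Python rendering under stated assumptions: agents are numbered from 0, a product state is a pair (tuple of local states, frozenset of (recipient, message) pairs), local moves and $\lambda$ constraints are passed as separate sets, and the names `step`, `STATES`, `LOCAL`, `LAM` and `START` are hypothetical.

```python
def step(states_of, local, lam, config, i, a):
    """Successors of config when agent i reads action a (after Definition 2.2)."""
    q, buf = config
    result = []
    for (src, act, dst) in local:
        if src != q[i] or act != a:
            continue                      # clause 1: a local a-move of agent i
        new_buf, ok = set(buf), True
        for j in range(len(states_of)):
            if j == i:
                continue
            # Clause 2: dst has lambda-predecessors in Q_j, so a matching
            # message addressed to i must be waiting; it is consumed.
            preds = {p for (p, r) in lam if r == dst and p in states_of[j]}
            if preds:
                waiting = [(k, m) for (k, m) in new_buf if k == i and m in preds]
                if not waiting:
                    ok = False
                    break
                new_buf.discard(waiting[0])
            # Clause 3: q[i] has lambda-successors in Q_j, so the unit buffer
            # from i to j must be empty before q[i] is deposited for j.
            if any(p == q[i] and r in states_of[j] for (p, r) in lam):
                if any(k == j and m in states_of[i] for (k, m) in buf):
                    ok = False
                    break
                new_buf.add((j, q[i]))
        if ok:
            result.append((q[:i] + (dst,) + q[i + 1:], frozenset(new_buf)))
    return result


# Agent 0 cycles between s0 and s1; agent 1 may enter t1 only after a message s0.
STATES = (frozenset({"s0", "s1"}), frozenset({"t0", "t1"}))
LOCAL = {("s0", "a", "s1"), ("s1", "a", "s0"), ("t0", "b", "t1")}
LAM = {("s0", "t1")}
START = (("s0", "t0"), frozenset())
```

In this small system the receiver cannot move first, and the sender is blocked on its second visit to `s0` while its earlier message is still in the unit buffer, which is the blocking behaviour the definition is meant to capture.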

Computations of $S$ are defined by runs of $Pr_S$. We consider only infinite
behaviours in this paper. Let $w = a_1 a_2 \ldots \in \Sigma^{\omega}$. An infinite run $\rho = x_0 x_1 \ldots$
on $w$ is a sequence where for $k \ge 0$, $x_k \stackrel{a_{k+1}}{\Rightarrow} x_{k+1}$. We say that $i$ terminates in $\rho$
if there exists $k$ such that $x_k[i]$ is terminal. $\rho$ is said to be good if for all $i \in [n]$,
$w \lceil i$ is infinite or $i$ terminates in $\rho$. Let
$Inf_i(\rho) = \{q \in Q_i \mid \text{for infinitely many } k \ge 0,\ x_k[i] = q\}$.
The run $\rho$ on $w$ is said to be accepting iff $\rho$ is good,
$x_0 \in I$, and for all $i$, $Inf_i(\rho) \cap G_i \ne \emptyset$. The language accepted by $Pr_S$ is denoted
$\mathcal{L}^{1}(S) = \{w \in \Sigma^{\omega} \mid S \text{ has an accepting run on } w\}$. Note that, if $Pr_S$ has no
infinite runs at all, then $\mathcal{L}^{1}(S) = \emptyset$. A consequence of the definition of good
runs is that no agent gets stuck because of the buffer condition. Figure 1 gives
a 2-agent system and its 1-buffered product.

Even though we refer to these products as 1-buffered, there are really $n(n-1)$
unit buffers in the system, one for each pair $(i, j)$, $i \ne j$, each containing at most
one message. Emptiness checking for an SCA can be done in time linear in the
size of the constructed product. We have:

Lemma 2.3 Given an SCA $S$ of $n$ automata, checking whether $\mathcal{L}^{1}(S) = \emptyset$ can
be done in time where $k$ is the maximum of $\{|Q_i| \mid i \in [n]\}$.

Fig. 1. 1-buffered product



2.2 Implicit Buffers 

We now study an alternative to bounded buffers, based on a more abstract notion
of finite state environment. For this, recall the notion of messages in SCAs: an
edge $q \xrightarrow{\lambda} q'$, with $q \in Q_i$ and $q' \in Q_j$, represents a constraint on runs; $q'$ cannot
be visited in the run until $q$ has been visited prior to it. The difficulty comes
when $q$ is visited several times; if each visit is recorded as a message in the buffer,
unless $i$ is made to wait, we require unbounded memory. On the other hand, if
$i$ is in a cycle, should we consider each visit as a new message? Or, if $i$ cycles
between $q_0$ and $q_1$, is it important for the receiver to know whether the cycle was
entered at $q_0$ or at $q_1$? These considerations lead us to the notion of implicit
products of automata in systems below.

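Definition 2.4 Given an SCA $S = ((Q_1, G_1), \ldots, (Q_n, G_n), \rightarrow, Init)$ on $\widetilde{\Sigma}$,
the implicit buffered product of the system is defined to be the tuple
$(X, I, G, \Rightarrow)$, where $X = \widetilde{Q} \times \mathcal{B}$, $I = (Init \times \{\emptyset\})$, $G = (G_1, \ldots, G_n)$ and
$\mathcal{B} = \{B \subseteq ([n] \times Q) \mid \text{if } (i, q) \in B \text{ then } q \notin Q_i\}$. $\Rightarrow \subseteq (X \times \Sigma \times X)$ is defined by:
$(q_1, \ldots, q_n, B) \stackrel{a}{\Rightarrow} (q'_1, \ldots, q'_n, B')$, $a \in \Sigma_i$, iff

1. $q_i \xrightarrow{a} q'_i$, and for all $j \ne i$, $q_j = q'_j$.

2. if $({}^{\bullet}q'_i \cap Q_j) = R \ne \emptyset$, then there exists $q \in R$ such that $(i, q) \in B$ and
$B' = B - \{(i, q)\}$.

3. if $(q_i^{\bullet} \cap Q_j) \ne \emptyset$, then $B' = B \cup \{(j, q_i)\}$.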

The language accepted by the system, denoted $\mathcal{L}_{imp}(S)$, is defined exactly as
before. Clearly, we have a notion here that’s very different from the previous one.
In fact it is easy to construct an SCA $S$ such that $\mathcal{L}_{imp}(S)$ is infinite, but $\mathcal{L}^{1}(S)$
is empty, with the system deadlocked. In implicit buffered products, the sender
is never blocked. However, every agent $i$ can offer a subset of $Q_i$ for each $j \ne i$
as messages in the buffer, and hence the complexity of checking for emptiness is
higher.
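
The difference from the bounded case lies only in clause 3, which no longer tests the buffer before depositing a message. A minimal Python sketch of the implicit successor function follows, under the same illustrative assumptions as before (0-based agents, a product state as a pair of a state tuple and a frozenset of (recipient, message) pairs); the names `implicit_step`, `STATES`, `LOCAL`, `LAM` and `START` are hypothetical.

```python
from itertools import product


def implicit_step(states_of, local, lam, config, i, a):
    """Successors of config when agent i reads a (after Definition 2.4)."""
    q, buf = config
    others = [j for j in range(len(states_of)) if j != i]
    out = []
    for (src, act, dst) in local:
        if (src, act) != (q[i], a):
            continue
        # Clause 2: for each j holding lambda-predecessors of dst, one waiting
        # message must be consumed; every possible choice gives a successor.
        choices = []
        for j in others:
            preds = {p for (p, r) in lam if r == dst and p in states_of[j]}
            if preds:
                choices.append([(k, m) for (k, m) in buf if k == i and m in preds])
        # Clause 3: deposit q[i] for every j that can receive it. There is no
        # emptiness test, so the sender never blocks and repeated sends of the
        # same state collapse into one set element.
        sent = {(j, q[i]) for j in others
                if any(p == q[i] and r in states_of[j] for (p, r) in lam)}
        for picked in product(*choices):
            new_buf = (set(buf) - set(picked)) | sent
            out.append((q[:i] + (dst,) + q[i + 1:], frozenset(new_buf)))
    return out


STATES = (frozenset({"s0", "s1"}), frozenset({"t0", "t1"}))
LOCAL = {("s0", "a", "s1"), ("s1", "a", "s0"), ("t0", "b", "t1")}
LAM = {("s0", "t1")}
START = (("s0", "t0"), frozenset())
```

Because the buffer is a set, visiting `s0` twice leaves a single entry behind: the system "forgets" how many times the message was sent, which is the bounded-memory reading of implicit buffers.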

Lemma 2.5 Given an SCA $S$ of $n$ automata, checking whether $\mathcal{L}_{imp}(S) = \emptyset$
can be done in time where $k$ is the maximum of $\{|Q_i| \mid i \in [n]\}$.

3 Lamport Diagrams and a Temporal Logic 

The computations of systems of communicating automata can be seen as Lamport 
diagrams, which depict local runs and message dependence between them. 

Definition 3.1 A Lamport diagram is a tuple $D = (E_1, \ldots, E_n, \le)$, where
$E_i$ is the set of event occurrences of agent $i$, $\le$ is a partial order on $E = \bigcup_i E_i$
called the causality relation such that: for all $i \in \{1, \ldots, n\}$, $E_i$ is totally ordered
by $\le$, and for all $e \in E$, ${\downarrow}e = \{e' \mid e' \le e\}$ is finite.

Since for all $e \in E$, ${\downarrow}e$ is finite, $\le$ must be discrete. Hence there exists $\lessdot \subseteq \le$,
the immediate causality relation, which generates the causality relation; that is:
for all $e, e', e''$, if $e \lessdot e'$ and $e \le e'' \le e'$ then $e'' \in \{e, e'\}$. We have: $\le = (\lessdot)^{*}$.
When $e \in E_i$, $e' \in E_j$, $i \ne j$ and $e \lessdot e'$, we interpret $e$ as the sending of a
message by agent $i$ and $e'$ its corresponding receipt by $j$.

Let $e \in E_i$. We can think of ${\downarrow}e$ as the local state of agent $i$ when the event
$e$ has just occurred. This state contains the information that $i$ has up till that
instant in the computation, which contains its own local history and that of others
according to the latest communication from them. The empty set corresponds
to the initial state, where no $i$-event has occurred, and is denoted by $\varepsilon_i$. Let the
set of all local states of agent $i$ be denoted $LC_i \stackrel{\text{def}}{=} \{\varepsilon_i\} \cup \{{\downarrow}e \mid e \in E_i\}$ and let
$LC = \bigcup_i LC_i$. We use $d, d'$ etc. to denote local states. We can extend the $\lessdot$
relation to local states as follows: let $d_1 \in LC_j$ and $d_2 \in LC_i$; we say $d_1 \lessdot d_2$ iff
$d_1 \subseteq d_2$ and for all $d \in LC_j$, if $d \subseteq d_2$, then $d \subseteq d_1$ as well; that is, $d_1$ is the last
$j$-local state seen by $i$ at $d_2$.
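
For a finite fragment of a Lamport diagram, local states and the "last $j$-local state seen" can be computed directly from the immediate causality edges. The Python sketch below assumes the diagram is given as a set of edges `(x, y)` meaning $x \lessdot y$ and a map `agent_of` from events to 0-based agent indices; the function names and the sample events are illustrative only.

```python
def down_closure(edges, e):
    """All events e' with e' <= e, given immediate-causality edges (x, y)."""
    seen, stack = {e}, [e]
    while stack:
        y = stack.pop()
        for (x, z) in edges:
            if z == y and x not in seen:
                seen.add(x)
                stack.append(x)
    return frozenset(seen)


def last_seen(edges, agent_of, e, j):
    """Local state of agent j last seen at the local state down_closure(e).

    Returns the empty frozenset (the initial state of j) when no j-event
    lies below e.
    """
    below = [x for x in down_closure(edges, e) if agent_of[x] == j]
    if not below:
        return frozenset()
    # j-events are totally ordered, so the maximal one has the largest closure.
    top = max(below, key=lambda x: len(down_closure(edges, x)))
    return down_closure(edges, top)


# Agent 0 performs a1, a2; agent 1 performs b1, b2; a1 is sent and received at b2.
EDGES = {("a1", "a2"), ("b1", "b2"), ("a1", "b2")}
AGENT_OF = {"a1": 0, "a2": 0, "b1": 1, "b2": 1}
```

Here the local state of agent 1 after `b2` contains `a1`, so agent 1 has seen agent 0's state after `a1`, whereas agent 0 after `a2` has seen only the initial state of agent 1.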

Given a Lamport diagram $D = (E_1, \ldots, E_n, \le)$, a sequentialization of $D$
is any sequence $\sigma = e_0 e_1 \ldots$ such that $E = \{e_0, e_1, \ldots\}$ and for all $k \ge 0$,
${\downarrow}e_k \subseteq \{e_0, \ldots, e_k\}$; that is, $\sigma$ is a linear order that respects $\le$. Let $\sigma = e_0 e_1 \ldots$ be
a sequentialization of $D$. We say $\sigma$ is $k$-bounded iff the following property holds:
suppose $\{h_1, \ldots, h_k, h_{k+1}\} \subseteq E_i$ such that $h_1 < \ldots < h_{k+1}$, for all $l : 1 \le l \le k$,
$g_l$ is the $j$-maximal event in ${\downarrow}h$, where $h$ is any event such that $h_l \le h < h_{l+1}$,
and $g_{k+1}$ is the $j$-maximal event in ${\downarrow}h_{k+1}$; then $g_{k+1}$ occurs later than $h_1$ in $\sigma$.
It is easy to see that, when $k = 1$, this is basically the same notion as that of
1-buffered runs studied in the previous section.
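
On a finite diagram, the sequentialization condition reduces to a simple check: every event appears exactly once, and each immediate causality edge points forward in the sequence. The sketch below assumes the same edge representation as a set of pairs `(x, y)` with $x \lessdot y$; the function name is hypothetical.

```python
def is_sequentialization(events, edges, seq):
    """True iff seq lists every event once and respects the causal order.

    Since the immediate edges generate the causality relation, it suffices
    that each edge (x, y) has x placed before y.
    """
    if len(seq) != len(set(seq)) or set(seq) != set(events):
        return False
    pos = {e: k for k, e in enumerate(seq)}
    return all(pos[x] < pos[y] for (x, y) in edges)


EVENTS = ["a1", "a2", "b1", "b2"]
EDGES = {("a1", "a2"), ("b1", "b2"), ("a1", "b2")}
```

The sequence `a1 b1 b2 a2` is accepted, while `b1 b2 a1 a2` is rejected because the receipt `b2` would precede its send `a1`.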

3.1 From Runs to Lamport Diagrams

Consider an SCA $S$; suppose that $Pr_S$ has an infinite run $\rho = x_0 x_1 \ldots$ on
$w = a_1 a_2 \ldots \in \Sigma^{\omega}$, i.e., for $k \ge 0$, $x_k \stackrel{a_{k+1}}{\Rightarrow} x_{k+1}$. $\rho$ induces a clock function
$\chi : ([n] \times [n] \times \mathbb{N}) \to \mathbb{N}$ which records, for each pair of agents $i, j$ and each
instance $k$, the latest instant at which $i$ last heard from the agent $j$ at $k$. We
define $\chi(i, j, k)$ by induction on $k$. Set $\chi(i, j, 0) = 0$ for all $i, j$. Suppose $\chi(i, j, k)$
is inductively defined and ${}^{\bullet}\rho(k+1)[i] \cap Q_j \ne \emptyset$; then, between $\chi(i, j, k)$ and
$k$, there could be many instances $l$ when $j$ could have sent a ‘message’ to $i$
($\rho(l)[j]^{\bullet} \cap Q_i \ne \emptyset$). However, since $\rho$ is 1-buffered, $l$ is unique and $\chi(i, j, k+1)$
is set to $l$.

The Lamport diagram $D_\rho$ associated with $\rho$ is defined as follows. Every
transition $x_k \stackrel{a_{k+1}}{\Rightarrow} x_{k+1}$ of $\rho$ is recorded as an event $(k, k+1)$ of $E_i$ in $D_\rho$, where
$a_{k+1} \in \Sigma_i$. The local causal order $\le_i$ of agent $i$ consists of all pairs of the
form $((k, k+1), (l, l+1))$ such that $k \le l$. For communication edges, define
$(m-1, m) <_c (k, k+1)$ iff $(m-1, m) \in E_j$, $(k, k+1) \in E_i$, $i \ne j$ and
$\chi(i, j, k) < \chi(i, j, k+1) = m$. Now we define $\le = ((\bigcup_i \le_i) \cup <_c)^{*}$ and it is easy to see that
$D_\rho$ is a Lamport diagram. Thus, to each infinite run $\rho$ of $Pr_S$, we can associate
a Lamport diagram $D_\rho$.

3.2 A Temporal Logic 

We now proceed to define the temporal logic, which we call m-LTL in which 
we can reason about local assertions on Lamport diagrams. A crucial aspect 
of the logic is the asymmetry in the communication modality: the receiver of 
a message gets information about the sender’s past, whereas the sender cannot 
access the receiver’s future. This is because, in distributed systems, when we 
cannot make any assumptions about relative speeds of processes, the sender 
cannot, in general, infer any information about the status of the receiver at the 
time of message receipt. On the other hand, local information (like, “I have sent 
a message”) can always be maintained using local propositions by the sender. 

Fix countable sets of propositional letters $(P_1, P_2, \ldots, P_n)$, where $P_i$ consists
of the atomic local properties of agent $i$. Let $P = \bigcup_i P_i$.

Let $i \in [n]$. The syntax of $i$-local formulas is given below:

$\Phi_i ::= p \in P_i \mid \lnot\alpha \mid \alpha_1 \lor \alpha_2 \mid \bigcirc\alpha \mid \alpha_1 \,\mathbf{U}\, \alpha_2 \mid \ominus_j \alpha,\ j \ne i,\ \alpha \in \Phi_j$

Global formulas are obtained by boolean combination of local formulas:

$\Psi ::= \alpha@i,\ \alpha \in \Phi_i \mid \lnot\psi \mid \psi_1 \lor \psi_2$

The propositional connectives $(\land, \supset, \equiv)$ and derived temporal modalities
$(\Diamond, \Box)$ are defined as usual. The formulas are interpreted on Lamport diagrams.
For technical convenience, we consider only infinite behaviours. Formally, models
are pairs of the form $M = (D, V)$, where $D = (E_1, \ldots, E_n, \le)$ is a Lamport
diagram such that $E = \bigcup_i E_i$ is a countably infinite set and $V : LC \to 2^{P}$ is the
valuation map such that for $d \in LC_i$, $V(d) \subseteq P_i$.

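Let $\alpha \in \Phi_i$ and $d \in LC_i$. The notion that $\alpha$ holds in the local state $d$ of agent
$i$ in model $M$ is denoted $M, d \models_i \alpha$, and is defined inductively as usual:

- $M, d \models_i p$ iff $p \in V(d)$.

- $M, d \models_i \lnot\alpha$ iff $M, d \not\models_i \alpha$.

- $M, d \models_i \alpha \lor \beta$ iff $M, d \models_i \alpha$ or $M, d \models_i \beta$.

- $M, d \models_i \bigcirc\alpha$ iff there exists $d' \in LC_i$ such that $d \lessdot d'$ and $M, d' \models_i \alpha$.

- $M, d \models_i \alpha \,\mathbf{U}\, \beta$ iff $\exists d' \in LC_i : d \subseteq d'$, $M, d' \models_i \beta$ and $\forall d'' \in LC_i : d \subseteq d'' \subset d' : M, d'' \models_i \alpha$.

- $M, d \models_i \ominus_j \alpha$ iff there exists $d' \in LC_j$ such that $d' \lessdot d$ and $M, d' \models_j \alpha$.

The new modality $\ominus_j \alpha$, asserted by $i$, says that $\alpha$ held in the last $j$-local
state visible to $i$. For global formulas, the corresponding notion is defined only
at initial states, in terms of the notion defined above for local formulas.

- $M \models \alpha@i$ iff $M, \varepsilon_i \models_i \alpha$; $M \models \lnot\psi$ iff $M \not\models \psi$;
$M \models \psi_1 \lor \psi_2$ iff $M \models \psi_1$ or $M \models \psi_2$.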
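
The local clauses above translate almost line by line into a recursive evaluator. The Python sketch below is an approximation under loudly stated assumptions: it works on a finite prefix of a model (so "next" fails at the last recorded local state and "until" must be witnessed inside the prefix), the $k$-th local state of agent `i` is identified with the index `k`, `val[i][k]` is its valuation, and `seen[i][k][j]` gives the index of the last local state of `j` visible to `i` there. The tuple encoding of formulas and all names are hypothetical.

```python
def holds(model, i, k, f):
    """Truth of local formula f at the k-th local state of agent i."""
    val, seen = model
    op = f[0]
    if op == "p":
        return f[1] in val[i][k]
    if op == "not":
        return not holds(model, i, k, f[1])
    if op == "or":
        return holds(model, i, k, f[1]) or holds(model, i, k, f[2])
    if op == "next":
        return k + 1 < len(val[i]) and holds(model, i, k + 1, f[1])
    if op == "until":
        for m in range(k, len(val[i])):
            if holds(model, i, m, f[2]):
                return True       # the witness for the right-hand formula
            if not holds(model, i, m, f[1]):
                return False      # the left-hand formula failed before it
        return False              # no witness inside the finite prefix
    if op == "last":              # the communication modality, asserted by i
        j = f[1]
        return holds(model, j, seen[i][k][j], f[2])
    raise ValueError(op)


# Agent 0 sends in its first step; agent 1 receives in its first step.
VAL = {0: [set(), {"sent"}], 1: [set(), {"got"}]}
SEEN = {0: [{1: 0}, {1: 0}], 1: [{0: 0}, {0: 1}]}
MODEL = (VAL, SEEN)
```

After the receipt, agent 1 can assert that `sent` held in the last state of agent 0 visible to it; before the receipt it can only see agent 0's initial state. Agent 0 never learns anything about agent 1 beyond its initial state, reflecting the asymmetry of the modality.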

We say that $\psi$ is satisfiable iff there exists a model $M$ such that $M \models \psi$. We
say that $\psi$ is valid if for every model $M$, we have $M \models \psi$.

A typical specification in the logic has the form
$(\Box(p \land \ominus_2 \lnot\text{‘OK’} \supset \bigcirc(q \land \ominus_2 \text{‘OK’})))@1$, which asserts that agent 1 can make a transition from a state
satisfying $p$ into a state in which $q$ holds only after hearing an ‘OK’ from agent
2, and must block otherwise.

3.3 Model Checking 

Every (1-buffered) infinite run of an SCA gives rise to a Lamport diagram, and
formulas of m-LTL are interpreted over such diagrams. We are thus in a position
to formulate the problem we began with: can we automatically check that all
runs of a given SCA satisfy a specification $\psi$ in the logic m-LTL?

To make this precise, we define an interpreted system to be a pair
$\mathcal{S} = (S, Val)$, where $S = ((Q_1, G_1), \ldots, (Q_n, G_n), \rightarrow, Init)$ is an SCA on $\widetilde{\Sigma}$ and $Val : Q \to 2^{P}$
is such that for all $q \in Q_i$, $Val(q) \subseteq P_i$. Now, every infinite run $\rho = x_0 x_1 \ldots$ of $S$
defines a Lamport diagram $D_\rho$ as we have seen above; we define the associated
model $M_\rho = (D_\rho, V_\rho)$, where $V_\rho(\varepsilon_i) = Val(x_0[i])$, and for all $e = (k, k+1)$ in
$E_i$, $V_\rho({\downarrow}e) = Val(x_{k+1}[i])$. It can be easily checked that $V_\rho$ is indeed a legal
valuation. We say that an interpreted system $\mathcal{S} = (S, Val)$ satisfies a formula
$\psi$ of m-LTL iff for every accepting run $\rho$ of $S$, the associated model $(D_\rho, V_\rho)$
induced by $Val$ and $S$ satisfies $\psi$. We denote this by $\mathcal{S} \models^{1} \psi$.

Theorem 3.2 Let $\psi$ be an m-LTL formula of length $m$. Satisfiability of $\psi$ over $n$-agent
Lamport diagrams can be checked in time . Let $\mathcal{S}$ be an interpreted
SCA with $k$ being the maximum of $\{|Q_i| \mid i \in [n]\}$. Then the question $\mathcal{S} \models^{1} \psi$
can be answered in time $k^{O(n)} \cdot 2^{O(mn)}$.

The theorem is proved by associating a system $S_\psi$ over the distributed al-
phabet $(2^{P_1}, \ldots, 2^{P_n})$ with every formula $\psi$ such that $\mathcal{L}^{1}(S_\psi)$ in some way corre-
sponds exactly to the class of models of $\psi$. This correspondence is made precise
as follows: let $M \models \psi$, where $M = (D, V)$ and let $\sigma = e_1 e_2 \ldots$ be any 1-bounded
sequentialization of $D$. Let $Prop_\sigma = V({\downarrow}e_1) V({\downarrow}e_2) \ldots$. Now, define $Lang(\psi)$ to
be the set of all such $Prop_\sigma$; it is a set of infinite words over the alphabet above. Moreover, $M$ defines an
$n$-tuple $\langle V(\varepsilon_1), \ldots, V(\varepsilon_n) \rangle$. Let $V(\psi)$ denote the set of all such tuples. Then
what we do is to associate an interpreted system $(S_\psi, Val_\psi)$ with every formula
$\psi$ such that $\mathcal{L}^{1}(S_\psi) = Lang(\psi)$ and $Val_\psi(Init) = V(\psi)$, where $Init$ is the set
of global initial states of $S_\psi$ and $Val_\psi$ is applied pointwise on it.

Now, checking satisfiability of a formula amounts to checking whether the
associated system accepts a nonempty language. For model checking, first note
that languages accepted by 1-buffered products are closed under intersection.
Hence, we can check emptiness of the language obtained by intersecting
$\mathcal{L}^{1}(S)$ with the language of the system associated with $\lnot\psi$, and answer accordingly whether the given interpreted system
satisfies $\psi$ or not. (Proof details are available from the authors.)

4 Implicit Buffers 

The computations of implicit products cannot be presented as Lamport dia- 
grams, since ‘communication edges’ can cross each other now, as they do in sys- 
tems where messages are not guaranteed to be delivered in the order in which 
they were sent. Moreover, since repeated visits to ‘send’ states are possible with- 
out ever visiting corresponding ‘receive’ states, the modelled systems do not pro- 
vide guaranteed message delivery either. We now generalize Lamport diagrams 
to include these possibilities formally and reinterpret our logic on them. 

Definition 4.1 A communication diagram is a tuple
$C = ((E_1, \le_1), \ldots, (E_n, \le_n), <_c)$, where for all $i \in [n]$, $\le_i$ is a total order on $E_i$ and $<_c \subseteq (E \times E)$
where $E = \bigcup_i E_i$. For $i \ne j$, $E_i \cap E_j = \emptyset$ and $<_c$ satisfies the following conditions:

1. $\le = (\bigcup_i \le_i \cup <_c)^{*}$ is acyclic.

2. If $e <_c e'$ and $e \in E_i$, then $e' \notin E_i$, and

3. For all $e \in E_i$, for each $j \ne i$, $e$ has at most one $<_c$ successor and at most
one $<_c$ predecessor in $E_j$.

Above, we refer to $(E, \le)$ as the poset generated by $C$.
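
For finite structures, the three conditions of Definition 4.1 can be checked mechanically. The following Python sketch assumes that `order[i]` lists the events of agent `i` in local order and that `comm` is a set of pairs `(e, f)` standing for $e <_c f$; acyclicity is tested with Kahn's topological sort on the local successor edges plus the communication edges. All names are illustrative.

```python
def is_communication_diagram(order, comm):
    """Check conditions 1-3 of Definition 4.1 on a finite structure."""
    agent = {e: i for i, evs in order.items() for e in evs}
    if sum(len(evs) for evs in order.values()) != len(agent):
        return False                  # event sets must be pairwise disjoint
    succ_seen, pred_seen = set(), set()
    for (e, f) in comm:
        if agent[e] == agent[f]:
            return False              # condition 2: edges cross agents
        if (e, agent[f]) in succ_seen or (f, agent[e]) in pred_seen:
            return False              # condition 3: at most one per agent
        succ_seen.add((e, agent[f]))
        pred_seen.add((f, agent[e]))
    # Condition 1: no cycle through local and communication edges.
    edges = set(comm)
    for evs in order.values():
        edges |= set(zip(evs, evs[1:]))
    indeg = {e: 0 for e in agent}
    for (_, y) in edges:
        indeg[y] += 1
    ready = [e for e, d in indeg.items() if d == 0]
    done = 0
    while ready:
        x = ready.pop()
        done += 1
        for (u, v) in edges:
            if u == x:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
    return done == len(agent)


ORDER = {0: ["a1", "a2"], 1: ["b1", "b2"]}
```

With `ORDER` as above, a single edge from `a1` to `b2` is accepted, whereas the pair of edges `a2` to `b1` and `b2` to `a1` is rejected because together with the local orders it closes a cycle.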

Given any Lamport diagram $D = (E_1, \ldots, E_n, \le)$, the structure defined by
$C_D = ((E_1, \le_1), \ldots, (E_n, \le_n), <_c)$ is a communication diagram, where
$\le_i = \le \cap (E_i \times E_i)$ and $<_c = \lessdot - \bigcup_i \le_i$. On the other hand, given any communication
diagram $C$, we can associate a structure $D_C = (E_1, \ldots, E_n, \le)$ with it, where $\le$
is the partial order generated by $C$. However, there is an important difference
between these classes of diagrams, as the following proposition shows.

Proposition 4.2 Let $D$, $C$ be as above: $D_{C_D}$ is isomorphic to $D$. $C_{D_C}$ is iso-
morphic to $C$ only if $C$ satisfies the following condition: for all $e_1, e_2 \in E_i$,
$f_1, f_2 \in E_j$, $i \ne j$, if $e_1 \le_i e_2$, $e_1 <_c f_1$ and $e_2 <_c f_2$, then $f_1 \le_j f_2$.
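
The condition in Proposition 4.2 says that communication edges between a fixed pair of agents never cross. A small Python sketch of this check on finite data follows; as before, `order[i]` is assumed to list agent `i`'s events in local order and `comm` to be a set of pairs `(e, f)` with $e <_c f$, and the function name is hypothetical.

```python
def preserves_send_order(order, comm):
    """True iff e1 <=_i e2, e1 <_c f1, e2 <_c f2 (f1, f2 in E_j) imply f1 <=_j f2."""
    pos = {e: (i, k) for i, evs in order.items() for k, e in enumerate(evs)}
    for (e1, f1) in comm:
        for (e2, f2) in comm:
            (i1, k1), (i2, k2) = pos[e1], pos[e2]
            (j1, m1), (j2, m2) = pos[f1], pos[f2]
            same_pair = i1 == i2 and j1 == j2
            if same_pair and k1 <= k2 and m1 > m2:
                return False      # the later send is received first
    return True


ORDER = {0: ["a1", "a2"], 1: ["b1", "b2"]}
```

Parallel edges (`a1` to `b1`, `a2` to `b2`) satisfy the condition; crossing edges (`a1` to `b2`, `a2` to `b1`) violate it and correspond to out-of-order delivery.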




496 B. Meenakshi and R. Ramanujam 



Thus communication diagrams generalize Lamport diagrams. We can now
interpret m-LTL on these frames. The notion ${\downarrow}e$ now refers to the generated
relation $\le$. The only changes are in the semantics of the communication modality.

- Let $d \in LC_i$ and $j \ne i$.
$M, d \models_i \ominus_j \alpha$ iff $\exists e \in E_i : d = {\downarrow}e$, $\exists e' \in E_j : e' <_c e$ and $M, {\downarrow}e' \models_j \alpha$.

- An implicit model is a pair $M = (C, V)$, where $C$ is a communication diagram
such that $E$ is countable, and $V : LC \to 2^{P}$ is the valuation map such that
for $d \in LC_i$, $V(d) \subseteq P_i$.

We now speak of implicit models for a formula, and of a formula being im-
plicitly satisfiable. Note that the $\ominus_j$ modality now refers to the presence of an
explicit edge: $e'$ may be the $j$-maximal event occurrence in ${\downarrow}e$ without $e' <_c e$; in
the earlier semantics such an $e'$ could witness the truth of $\ominus_j \alpha$ at ${\downarrow}e$, but this is
not possible with the semantics presented here.

Given a formula $\psi$, we can construct a system in a similar manner as
before. We omit the proof and mention only the results.

Theorem 4.3 Implicit satisfiability of a formula ip in m-LTL over n-agent com- 
munication diagrams can be decided in time 2 '^^^ " \ where m = lip]. Let S be 
an interpreted SCA with k being the maximum of {\Qi\ \ i G [n]}. Then the 
question S \=Jrnp V' can be answered in time fcO(n) 20(2 n ) ^ 

5 A System Specification 

In this section we consider a simplified version of a case study igntdsi . mm . 
The system has five components: two authors ($A_1$ and $A_2$), a moderator $M$
and two reviewers ($R_1$ and $R_2$). Both authors submit papers to $M$ who passes
them on to both reviewers who review each submission, and send the result 
to M. A paper is accepted only if both the reviewers choose to accept it. M 
communicates results to authors and there is no direct communication between 
authors and reviewers. For simplification we assume that papers are considered 
one at a time. 

The system is modelled as an SCA given in Figure 2. For simplification, we
have just shown the automata corresponding to $A_1$, $M$, $R_1$. The $\lambda$-constraints
across the system are shown by dotted directed lines pointing to the send-
ing/receiving states. The $\lambda$-constraints associated with the states $p'_1$ and $s'_1$ (of
$M$ and $R_1$ respectively) are symmetric to those associated with the states $p_1$ and
$s_1$ respectively and hence not mentioned.

$A_1$ submits at state $q_0$ (a $\lambda$ constraint to state $p_1$ of $M$), and then waits for
the result: accept is state $q_1$ ($\lambda$ edge from $p_6$ of $M$), reject is $q_2$, and a positive result
for author $A_2$ is recorded in $q_3$. $M$ has two symmetric sub-components, one
for $A_1$ and the other for $A_2$. When it gets a submission (state $p_1$), it sends it
to $R_1$ and $R_2$ and waits for the four possible outcomes, one of which is accept
(state $p_2$), the rest being reject (states $p_3$ through $p_5$); these are accordingly
communicated in states $p_6$ and $p_7$. $R_1$ is simple: it gets a submission, makes a
binary choice and reverts to waiting.

Fig. 2. Paper review system


With each component a set of propositions is associated. The intended mean-
ing of each proposition should be clear from the name. With $A_1$:
$P_1 = \{result_1,\ pl_1 = 0,\ pl_1 = 1,\ pl_1 = 2\}$. With $M$:
$P_3 = \{wait, proc_1, proc_2\} \cup \{FB_i = x \mid x \in Dec,\ i = 1, 2\} \cup \{R = x \mid x \in Dec\}$, where $Dec = \{acc, rej, nil\}$. With $R_1$:
$P_4 = \{ready_1, review_1(i), decided_1(i), OK_1(i) \mid i = 1, 2\}$. The sets of propositions
$P_2$ for author $A_2$ and $P_5$ for reviewer $R_2$ correspond to $P_1$ and $P_4$ respectively.
$pl_i$ indicates the ‘publication list’ of $i$. Note that each author keeps a record of
the current list. $FB_i$ stands for feedback from $i$.
The valuation map is suitably defined from the automaton description. The
following are some examples of local properties expressed in m-LTL, which the
1-buffered product of the system in Figure 2 satisfies.

1. $M$-formula: $\Box(proc_1 \land (\ominus_1(\lnot result_1 \land pl_1 = 0) \supset ((proc_1 \land \lnot wait) \,\mathbf{U}\, (R = acc \lor R = rej))))$

2. $M$-formula: $\Box((proc_1 \land (R = acc)) \supset (\ominus_4(decided_1(1) \land OK_1(1)) \land \ominus_5(decided_2(1) \land OK_2(1))))$

3. $A_1$-formula: $\Box(pl_1 = 2 \supset \ominus_3(proc_2 \land R = acc))$

4. $R_1$-formula: $\Box(review_1(1) \supset \ominus_1(\lnot result_1 \land pl_1 = 0))$

Interestingly, $R_1$ can assert this despite having no direct contact with $A_1$: this
is because it can assert $\ominus_3 \ominus_1(\lnot result_1 \land pl_1 = 0)$ above, and by the semantics
of the modality, the use of $\ominus_1$ suffices.

Remark: m-LTL is based on local assertions, and hence the sender of a message 
cannot refer to the state of the receiver on receiving the message. However we 
can use propositions $s_j \in P_i$, $i \ne j$, whereby $i$ asserts that a message meant for $j$
has been sent; this is a local property and does not refer to receiver states. The 
technical results can be easily extended, and we get additional convenience in 
specification. We can also study extensions of the logic with past tense and global 
modalities; in the presence of the ‘send’-propositions above, global formulas can 
force buffers of specified size in buffered products. 
Abstract. In this paper we introduce two notions of security: multi-user
indistinguishability and multi-user non-malleability. We believe that they
encompass the correct requirements for public key encryption schemes
in the context of multicast communications. A precise and non-trivial
analysis proves that they are equivalent to the former single-user notions,
provided the number of participants is polynomial. We also introduce a
new definition for non-malleability which is simpler than those currently
in use. We believe that our results are of practical significance: especially
they support the use of PKCS#1 v.2 based on OAEP in the multicast
setting.

Keywords: Multicast encryption, semantic security, non-malleability. 



1 Introduction 

1.1 Motivation 

With the growth of wide area networks, cryptographic tools often have to coexist 
and perform related computations. This may raise new security concerns. For 
example, broadcast encryption has been the subject of several specific attacks, 
notably directed against low-exponent RSA m- Basically, if e is the common 
public exponent, then e encryptions of a given message under different public 
keys lead to an easy recovery of the plaintext. Further results by Håstad ITIE^
and Coppersmith m proved that “time stamp” variants of broadcast, attaching 
time to the message before encryption, can be successfully cryptanalyzed with 
e encrypted messages. So far, most known attacks against RSA assume that 
related plaintexts have been encrypted to different destinations, which enables 
an eavesdropper to take advantage of the strong dependences between the RSA 
permutations, although each one is individually one-way. 
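
The basic broadcast attack mentioned above is easy to reproduce on toy parameters. The following Python sketch assumes textbook (unpadded) RSA with common exponent $e = 3$, three pairwise coprime moduli built from small primes chosen only for illustration, and a plaintext smaller than every modulus; it combines the ciphertexts with the Chinese Remainder Theorem and takes an integer root. The function names and the numeric values are our own, and `pow(m, -1, n)` requires Python 3.8 or later.

```python
def crt(residues, moduli):
    """Smallest x >= 0 with x = r_k (mod n_k) for pairwise coprime moduli."""
    x, m = 0, 1
    for r, n in zip(residues, moduli):
        # Lift x (mod m) to a solution that is also r (mod n).
        t = ((r - x) * pow(m, -1, n)) % n
        x, m = x + m * t, m * n
    return x


def integer_root(v, e):
    """Largest integer r with r**e <= v, by binary search."""
    lo, hi = 0, 1
    while hi ** e <= v:
        hi *= 2
    while hi - lo > 1:            # invariant: lo**e <= v < hi**e
        mid = (lo + hi) // 2
        if mid ** e <= v:
            lo = mid
        else:
            hi = mid
    return lo


def broadcast_attack(ciphertexts, moduli, e):
    """Recover m from e encryptions m**e mod n_k, assuming m < every n_k."""
    return integer_root(crt(ciphertexts, moduli), e)


# Toy moduli: products of distinct primes congruent to 2 mod 3, so e = 3 is valid.
MODULI = [101 * 113, 107 * 131, 137 * 149]
MESSAGE = 4242
CIPHERTEXTS = [pow(MESSAGE, 3, n) for n in MODULI]
```

Since the plaintext is smaller than each modulus, its cube is smaller than the product of the three moduli, so the CRT value is the cube itself over the integers and the root is exact. No factoring is involved, which is why the one-wayness of each individual RSA permutation does not help.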

Despite these attacks, RSA with small exponents is the de facto standard 
and multicast encryption is performed in many products by encapsulating a 
symmetric key within several RSA encryptions together with side data which 
are specific to each receiver. This is precisely the context that we wish to address 
and we believe that the related security issues needed to be cleared up in order 
to ensure confidence in standard designs that allow multicast encryption such as 
PKCS#1. Thus, albeit technical, our research is of practical significance. 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 499-^1^ 2000. 

© Springer-Verlag Berlin Heidelberg 2000

1.2 Notions of Security for Encryption

In this paper, we wish to propose notions of security that adequately prevent 
the attacks just mentioned. Usually, a security level is analyzed in terms of the 
goal and power of an adversary. The ultimate goal that can be achieved is called 
invertibility: given a public key and an encryption of m, retrieve the whole plain- 
text m. The RSA assumption implies that the basic RSA encryption scheme is 
non-invertible. As shown in the above example, the related notion dramatically 
collapses in a broadcast attack. In a different context, stronger notions of secu-
rity have been proposed. Goldwasser and Micali define semantic security ini
(also called indistinguishability) as the inability for an adversary to distinguish
encryptions of two plaintexts. This requires probabilistic encryption, where each
plaintext has many corresponding ciphertexts, depending on a random parame-
ter. Recent successful attacks against RSA-like cryptosystems 0 based on known
plaintext relations stress the need for proven schemes achieving semantic se-
curity.
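
The need for probabilistic encryption can be seen from a one-line adversary: against any deterministic public-key scheme, it suffices to re-encrypt one of the two candidate plaintexts and compare. The Python sketch below illustrates this with textbook RSA on a toy modulus; the function names, the modulus and the exponent are assumptions chosen for illustration and do not come from the paper.

```python
def toy_encrypt(m, n=101 * 113, e=3):
    """Deterministic textbook RSA on a toy modulus (illustration only)."""
    return pow(m, e, n)


def distinguish(encrypt, m0, m1, challenge):
    """Guess which of m0, m1 was encrypted, using only the public operation."""
    return 0 if encrypt(m0) == challenge else 1


def play(encrypt, m0, m1, bit):
    """One round of the distinguishing game with the hidden choice `bit`."""
    challenge = encrypt(m1 if bit else m0)
    return distinguish(encrypt, m0, m1, challenge)
```

The adversary wins every round as long as the two plaintexts encrypt to different ciphertexts, so no deterministic scheme can be indistinguishable in this sense.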

Surprisingly, the relationship between broadcast attacks and the improved
notions of security has not been the subject of specific research, even if known
cryptanalyses seem to fail against semantic security. The motivation of this paper
is to investigate whether semantic security, contrary to invertibility, is robust in
scenarios involving a general notion of multicast. Our first result gives a positive
answer: if one can gain a bit of information by considering a specific set of
multicast encrypted messages, then at least one scheme used for encryption is not
semantically secure. The proof relies on the hybrid technique and is conceptually
simple. This result was obtained independently of Bellare, Boldyreva and Micali,
who addressed the same problem.

Next, we develop a similar analysis with the notion of non-malleability,
introduced by Dolev, Dwork and Naor. Informally, the notion asserts that,
given a ciphertext, it should be impossible to generate a different ciphertext
so that the respective plaintexts are related. The problem of encrypted bids is 
a famous situation where an eavesdropper may try to under-bid a ciphertext 
of an unknown amount s, without learning anything about s. This is precisely 
what non-malleability tries to prevent. A broadcast scenario may be envisioned 
where several recipients collect the bids over a network. The multicast notion 
requires that the view of many encrypted messages under different public keys 
gives no advantage in producing the encryption of a related plaintext. Again, 
we prove that our new definition of multi-user non-malleability is equivalent to 
the former single-user notion: no broadcast attack can be performed against a 
non-malleable scheme. Here, the reduction is definitely much harder to obtain. 
Due to the complex nature of the definitions, involving auxiliary distributions of 
plaintexts and binary relations, both issued by the attacker, our previous natural 
reduction cannot be applied. The major technical point of the proof relies on a
lemma embedding any distribution into the product of a two-element distribution,
which leads to a simpler definition of non-malleability. We think that this lemma
may be of independent interest to cryptographers.

We now discuss the notion of security in terms of the adversary’s power.
Usually, an attacker is a probabilistic polynomial time Turing machine running
in two stages. First, given a public key, it performs a precomputation stage
and halts. From the output data, a challenge is randomly encrypted and given
to the attacker, which performs a second stage of computation. The polynomial
strength of the attacker may be increased by giving it access to a decryption
oracle. Whether the oracle is unavailable, accessible during the first stage only,
or accessible during the whole computation leads to three different scenarios.
Under a chosen-plaintext attack, the adversary can obtain ciphertexts of his
choice, which gives no additional power in the context of public key encryption
since anyone can encrypt. Under chosen ciphertext attack, the adversary is
allowed to use a decryption oracle during the precomputation stage only. Lastly,
under adaptive chosen ciphertext attack, the adversary is allowed to use a
decryption oracle during the whole algorithm, with the trivial restriction that
the challenge cannot be submitted to the oracle. The latter is the
ideal candidate that one should consider in order to provide the best arguments
for security. In our paper, whenever a theorem is stated, it is assumed that one
of the three contexts given above has been fixed and hence no decryption oracle
is mentioned; potential oracles are preferably viewed as internal parts of the
attacker.
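
The three attack models differ only in when the decryption oracle may be queried. The following is a minimal sketch of that rule, not a construction from the paper: the class name `DecryptionOracle`, the mode labels `CPA`, `CCA1`, `CCA2` and the `decrypt` callback are all hypothetical names chosen for illustration.

```python
class DecryptionOracle:
    """Gate access to a decryption function according to the attack model.

    mode is "CPA" (no oracle), "CCA1" (oracle in the first stage only)
    or "CCA2" (oracle in both stages, except on the challenge).
    These labels are illustrative names, not notation from the paper.
    """

    def __init__(self, decrypt, mode):
        if mode not in ("CPA", "CCA1", "CCA2"):
            raise ValueError("unknown attack model: " + mode)
        self._decrypt = decrypt
        self._mode = mode
        self._challenge = None       # set once the challenge is issued
        self._second_stage = False

    def start_second_stage(self, challenge):
        """Record the challenge ciphertext; the second stage begins."""
        self._challenge = challenge
        self._second_stage = True

    def query(self, ciphertext):
        if self._mode == "CPA":
            raise PermissionError("no decryption oracle under chosen-plaintext attack")
        if self._second_stage:
            if self._mode == "CCA1":
                raise PermissionError("oracle is closed after the precomputation stage")
            if ciphertext == self._challenge:
                raise PermissionError("the challenge itself cannot be queried")
        return self._decrypt(ciphertext)


# Toy usage with a stand-in "decryption" that subtracts 1 (not a real cipher).
oracle = DecryptionOracle(lambda c: c - 1, mode="CCA2")
first_stage_answer = oracle.query(10)       # allowed in stage one
oracle.start_second_stage(challenge=42)
second_stage_answer = oracle.query(11)      # allowed: differs from the challenge
```

Viewing the oracle as a wrapper of this kind matches the convention above that oracles are treated as internal parts of the attacker once the attack model is fixed.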



1.3 Outline of the Paper 

The rest of the paper is organized as follows. Section 2 gives common defini- 
tions and notations for encryption and probabilities. Sections 3 and 4 contain 
our analysis of semantic security (which we call indistinguishability) and non- 
malleability. Both introduce definitions of these notions in the context of multi- 
cast. The conclusion follows in section 5. 

2 Definitions and Notations 

A public key encryption scheme $\Pi$ is a triplet $(\mathcal{K}, \mathcal{E}, \mathcal{D})$ consisting of three
probabilistic polynomial time algorithms.

— $\mathcal{K}$ is the key generation algorithm which, given a security parameter $k$
(usually viewed as a unary input $1^k$), produces from its random source $\omega$ a pair
$(pk, sk)$ of public and secret keys.

— $\mathcal{E}$ is the probabilistic encryption algorithm which, given the security
parameter $k$, defines a message space $\mathcal{M}$ such that: for each string $x$ from $\mathcal{M}$,
and for each valid public key $pk$, $\mathcal{E}_{pk}(x)$ is a string $y$, called the encryption
of $x$ under $pk$.

— $\mathcal{D}$ is the (deterministic) decryption algorithm. It is required that for every
message $x$ in $\mathcal{M}$ and for every pair $(pk, sk)$ output by $\mathcal{K}$,
$\mathcal{D}_{sk}(\mathcal{E}_{pk}(x)) = x$. In all other cases, the output of $\mathcal{D}$ is any element of
$\mathcal{M} \cup \{\bot\}$. A ciphertext whose decryption is $\bot$ is said to be invalid.

A real-valued function $f(n)$ is negligible if for any integer $k$, $|f(n)| < n^{-k}$ for
sufficiently large $n$.
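
To make the triplet of algorithms concrete, here is a minimal sketch using a toy El Gamal instance. It is only an illustration under stated assumptions: the group parameters `P`, `Q`, `G` and the function names are chosen here, they are far too small to offer any security, and the paper does not prescribe this code.

```python
import secrets

# Toy group parameters (assumed for illustration; far too small to be secure).
P = 467   # safe prime, P = 2*Q + 1
Q = 233   # prime order of the subgroup of squares modulo P
G = 4     # generator of that subgroup

def keygen():
    """Key generation: return a (pk, sk) pair."""
    sk = secrets.randbelow(Q - 1) + 1       # sk in [1, Q-1]
    return pow(G, sk, P), sk

def encrypt(pk, x):
    """Probabilistic encryption of x in [1, P-1]."""
    r = secrets.randbelow(Q - 1) + 1        # fresh randomness per call
    return pow(G, r, P), (x * pow(pk, r, P)) % P

def decrypt(sk, y):
    """Deterministic decryption; None stands in for the invalid symbol."""
    c1, c2 = y
    if not (0 < c1 < P and 0 < c2 < P):
        return None
    shared = pow(c1, sk, P)
    return (c2 * pow(shared, -1, P)) % P    # modular inverse needs Python 3.8+

pk, sk = keygen()
message = pow(G, 5, P)                      # a message encoded as a subgroup element
recovered = decrypt(sk, encrypt(pk, message))
```

The correctness requirement of the definition corresponds to `recovered == message` for every key pair and every valid message.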

Given a distribution $\delta$ over a finite space $\Omega$, we let $\Pr_\delta[E]$ be the probability
of an event $E$. When $\delta$ is omitted, it is implicitly assumed that $\delta$ is the uniform
distribution. The support of $\delta$ is the set of elements from $\Omega$ whose probability
is nonzero. Often, a random variable is conveniently defined by the output
distribution of a probabilistic Turing machine. We let $y \leftarrow TM_\omega(x)$ be the result
$y$ obtained by running $TM$ on input $x$ and random source $\omega$. If $S$ is a finite set
then $y \leftarrow S$ is the operation of picking an element uniformly in $S$.

When considering several encryption schemes $\Pi_1, \ldots, \Pi_n$ and their related
algorithms, we will denote by $\mathcal{K}^n$, $\mathcal{E}^n$ and $\mathcal{D}^n$ the algorithms that, given an input
vector of $n$ adequate data, output a vector of dimension $n$ whose distribution is
given by the product of the output distributions of $\mathcal{K}_1 \times \cdots \times \mathcal{K}_n$,
$\mathcal{E}_1 \times \cdots \times \mathcal{E}_n$ and $\mathcal{D}_1 \times \cdots \times \mathcal{D}_n$ respectively. We stress that the
encryption schemes need not all be identical.

Our multicast notion enlarges the intuitive definition of broadcast, in which a
unique plaintext is encrypted. In this paper, we consider a multicast
communication as a set of encryptions of suitably related plaintexts under different
public keys. For example, the reader might consider messages containing the name of
the recipient followed by a possibly common text. Formally, a broadcast
distribution of plaintexts is any diagonal distribution whose support is in $\mathcal{M}^n$,
whereas a multicast distribution of plaintexts is any distribution whose support
is in $\mathcal{M}^n$.
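
The difference between the two kinds of plaintext vectors can be illustrated with a short sketch. The function names and the sample recipients below are hypothetical and serve only to mirror the example of a recipient name followed by a common text.

```python
def broadcast_plaintexts(text, recipients):
    """Diagonal vector: every recipient gets the same plaintext."""
    return [text for _ in recipients]

def multicast_plaintexts(text, recipients):
    """Related but distinct plaintexts: recipient name followed by a common text."""
    return [name + ": " + text for name in recipients]

recipients = ["alice", "bob", "carol"]
broadcast = broadcast_plaintexts("meeting at noon", recipients)
multicast = multicast_plaintexts("meeting at noon", recipients)
```

Each coordinate would then be encrypted under the corresponding recipient's public key.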



3 Indistinguishability 

3.1 Single-User Encryption Schemes 

Secure encryption should preserve privacy even in the critical context where the
messages are taken from a small set of plaintexts: it should be impossible for an
eavesdropper to distinguish encryptions of distinct values. Such a requirement is
captured by the notion of indistinguishability, also known as semantic security.
Examples, secure against chosen plaintext attack, include El Gamal
(based on the decisional Diffie-Hellman assumption), Naccache-Stern
(based on higher residues) and Okamoto-Uchiyama (based on factorization).
Our definition exactly follows [2] and uses the same notations.
Indistinguishability is defined by the advantage of an adversary $A = (A_1, A_2)$
performing a sequence of two algorithms.

In a first step, algorithm $A_1$ is run on input of the public key $pk$ and outputs
two plaintext messages $x^0$ and $x^1$ plus a string $s$ encoding information to be
handed to $A_2$. Next, a message from $\{x^0, x^1\}$ is chosen at random and encrypted
into a challenge ciphertext $y$. In a second step, $A_2$ is given the input $(y, s)$ and
has to guess which of the two plaintexts was encrypted, that is, the bit $b$ such
that $y$ encrypts $x^b$. The advantage of $A$ is measured by the probability that it
outputs the correct bit of the challenge. The scheme is indistinguishable if no
adversary obtains an advantage significantly greater than one would obtain by
flipping a coin. The formal definition follows:

Definition 1. Single-user indistinguishability.

Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme with a security parameter $k$ and let
$A = (A_1, A_2)$ be an adversary. For $k \in \mathbb{N}$, we define the advantage:

\[
\mathrm{Adv}_{A,\Pi}(k) = 2 \Pr\left[ (pk, sk) \leftarrow \mathcal{K}(1^k);\ (x^0, x^1, s) \leftarrow A_1(pk);\ b \leftarrow \{0,1\};\ y \leftarrow \mathcal{E}_{pk}(x^b) \ :\ A_2(s, y) = b \right] - 1
\]

We say that $\Pi$ is single-user indistinguishable (S-IND) if for every polynomial
time adversary $A$, $\mathrm{Adv}_{A,\Pi}(k)$ is negligible.
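
The game behind this definition can be run as a small experiment. The sketch below is an illustration under stated assumptions, not part of the paper: it uses a toy textbook RSA key (deterministic, hence not indistinguishable), and the names `a1`, `a2` and `ind_advantage` are chosen here. Because encryption is deterministic, the adversary simply re-encrypts one candidate plaintext and compares.

```python
import secrets

# Toy textbook RSA public key (assumed for illustration; deterministic and insecure).
N, E = 3233, 17          # 3233 = 61 * 53

def keygen():
    return (N, E), None  # the secret key is not needed in this experiment

def encrypt(pk, x):
    n, e = pk
    return pow(x, e, n)  # deterministic: the same x always gives the same y

def a1(pk):
    """First stage: choose two plaintexts and keep the public key as state s."""
    return 2, 3, pk

def a2(s, y):
    """Second stage: re-encrypt x0 under the public key kept in s and compare."""
    return 0 if encrypt(s, 2) == y else 1

def ind_advantage(trials=1000):
    """Estimate 2 * Pr[A2 guesses b] - 1 by running the game `trials` times."""
    wins = 0
    for _ in range(trials):
        pk, _sk = keygen()
        x0, x1, s = a1(pk)
        b = secrets.randbelow(2)            # challenge bit
        y = encrypt(pk, (x0, x1)[b])
        wins += (a2(s, y) == b)
    return 2 * wins / trials - 1

advantage = ind_advantage()
```

Against a scheme that meets the definition, the same experiment would return a value close to zero for every efficient adversary; here the adversary wins every run, which illustrates why probabilistic encryption is required.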



3.2 Multicast Encryption Schemes 

In the context of multicast, the usual notion of indistinguishability does not, 
by itself, guarantee that no bit of information is leaked when putting together 
the encryptions of related messages under different public keys. Our definition 
captures this stronger notion of security by giving the adversary the ability 
to choose two vectors of plaintexts whose coordinates are plaintext messages 
possibly related or even identical. Next, one of the two vectors is chosen at 
random and is encrypted coordinatewise with the different public keys. The 
final goal of the adversary is to guess which one was encrypted. This is easily 
done if a boolean function distinguishes the two vectors of plaintexts and is 
computable from the encrypted data. Again our formal definition is in terms 
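of the advantage of an adversary playing the game just given. In the following,
underlined variables denote vectors of size $n$; the $i$-th coordinate refers to the
$i$-th cryptosystem.

Definition 2. Multi-user indistinguishability.

Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme with a security parameter $k$ and let
$A = (A_1, A_2)$ be an adversary. For $k, n \in \mathbb{N}$, we define the advantage:

\[
\mathrm{Adv}_{A,\Pi}(k, n) = 2 \Pr\left[ (\underline{pk}, \underline{sk}) \leftarrow \mathcal{K}^n(1^k);\ (\underline{x}^0, \underline{x}^1, s) \leftarrow A_1(\underline{pk});\ b \leftarrow \{0,1\};\ \underline{y} \leftarrow \mathcal{E}^n_{\underline{pk}}(\underline{x}^b) \ :\ A_2(s, \underline{y}) = b \right] - 1
\]

We say that $\Pi$ is multi-user indistinguishable (M-IND) if for every polynomial
time adversary $A$, $\mathrm{Adv}_{A,\Pi}(k, n)$ is negligible.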



3.3 Results 

As expected, any multi-user indistinguishable encryption scheme $\Pi$ is also
single-user indistinguishable. Indeed, if an adversary distinguishes $\mathcal{E}_{pk}(m^0)$ from
$\mathcal{E}_{pk}(m^1)$ then it obviously distinguishes two encrypted vectors whose first
coordinates are the encryptions of $m^0$ and $m^1$ under the public key $pk$. Also note
that the usual definition of (single-user) indistinguishability, expressed in [2], is
the particular case of multi-user indistinguishability where $n = 1$. The following
result achieves equivalence.

Theorem 1. S-IND $\Rightarrow$ M-IND.

If encryption scheme $\Pi$ is single-user indistinguishable, then it is multi-user
indistinguishable.

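Proof. Let $A$ be an adversary attacking $\Pi$ in the sense of M-IND. We build $n$
adversaries $B_i = (B_{i,1}, B_{i,2})$ as follows:

Algorithm $B_{i,1}(pk_i)$:
    $\underline{pk} \leftarrow (pk_1, .., pk_i, .., pk_n)$
    $(\underline{x}^0, \underline{x}^1, s) \leftarrow A_1(\underline{pk})$
    return $(x_i^0, x_i^1, s)$

Algorithm $B_{i,2}(y_i, s)$:
    $b' \leftarrow \{0, 1\}$
    $\underline{y} \leftarrow (y_1, .., y_n)$ with $y_j = \mathcal{E}_{pk_j}(x_j^{b'})$ if $j < i$
                                  $y_j = \mathcal{E}_{pk_j}(x_j^{\overline{b'}})$ if $j > i$
    $b'' \leftarrow A_2(\underline{y}, s)$
    return $b''$

In a first step, $B_{i,1}$ extends $pk_i$ to a vector of public keys $\underline{pk}$, using $(n-1)$ times
the algorithm $\mathcal{K}$. Then $A_1$ is run with the input $\underline{pk}$. The $i$-th coordinates of the
pair of plaintext vectors output by $A_1$ are returned, which completes the first part
of the algorithm. We denote by $b$ the unknown bit of the challenge, i.e.
$y_i = \mathcal{E}_{pk_i}(x_i^b)$. In a second step, $B_{i,2}$ extends its input $y_i$ to a hybrid vector $\underline{y}$:
the first coordinates of $\underline{y}$ come from the encryption of $\underline{x}^{b'}$ whereas the last
coordinates of $\underline{y}$ come from the encryption of $\underline{x}^{\overline{b'}}$. The bit $b''$ output by running
$A_2$ on $\underline{y}$ is returned as an answer to the challenge.

We now compute the advantage of $B_i$ for $\underline{pk}$, $\underline{x}^0$, $\underline{x}^1$ and $s$ fixed. Let $d$ be
a random bit and let $\mathrm{Pr}_i$ (respectively $\mathrm{Pr}'_i$) be the probability that the initial
adversary $A_2$ successfully guesses the plaintext of the left (respectively right)
part of a hybrid ciphertext formed with $i$ coordinates from $\underline{x}^d$ followed by $(n-i)$
coordinates from $\underline{x}^{\bar{d}}$.

\[
\mathrm{Pr}_i = \Pr\left[ d \leftarrow \{0,1\};\ \underline{c} \leftarrow \mathcal{E}^n_{\underline{pk}}(x_1^d, .., x_i^d, x_{i+1}^{\bar{d}}, .., x_n^{\bar{d}});\ d' \leftarrow A_2(\underline{c}, s) \ :\ d' = d \right]
\]
\[
\mathrm{Pr}'_i = \Pr\left[ d \leftarrow \{0,1\};\ \underline{c} \leftarrow \mathcal{E}^n_{\underline{pk}}(x_1^d, .., x_i^d, x_{i+1}^{\bar{d}}, .., x_n^{\bar{d}});\ d' \leftarrow A_2(\underline{c}, s) \ :\ d' = \bar{d} \right]
\]

Note that
\[
\mathrm{Pr}_i + \mathrm{Pr}'_i = 1 \tag{1}
\]

We apply Bayes’ theorem, considering the value of the bit $b'$ randomly chosen
in the algorithm $B_{i,2}$:

\[
\begin{aligned}
&\Pr\left[ b \leftarrow \{0,1\};\ y_i \leftarrow \mathcal{E}_{pk_i}(x_i^b);\ b'' \leftarrow B_{i,2}(y_i, s) \ :\ b'' = b \right] \\
&\quad = \tfrac{1}{2} \Pr\left[ b \leftarrow \{0,1\};\ y_i \leftarrow \mathcal{E}_{pk_i}(x_i^b);\ b'' \leftarrow B_{i,2}(y_i, s) \ :\ b'' = b \mid b' = b \right] \\
&\qquad + \tfrac{1}{2} \Pr\left[ b \leftarrow \{0,1\};\ y_i \leftarrow \mathcal{E}_{pk_i}(x_i^b);\ b'' \leftarrow B_{i,2}(y_i, s) \ :\ b'' = b \mid b' \neq b \right] \\
&\quad = \tfrac{1}{2} \mathrm{Pr}_i + \tfrac{1}{2} \mathrm{Pr}'_{i-1}
\end{aligned}
\tag{2}
\]

It follows from (1) and (2) that the advantage of $B_i$ is:

\[
\mathrm{Adv}_{B_i,\Pi} = 2 \left( \tfrac{1}{2} \mathrm{Pr}_i + \tfrac{1}{2} \mathrm{Pr}'_{i-1} \right) - 1 = \mathrm{Pr}_i - \mathrm{Pr}_{i-1}
\]

Middle terms cancel in the sum, so that:

\[
\sum_{i=1}^{n} \mathrm{Adv}_{B_i,\Pi} = \mathrm{Pr}_n - \mathrm{Pr}_0 = \mathrm{Adv}_{A,\Pi}
\]

Consequently, if $i$ is uniformly chosen at random in $\{1, .., n\}$, we obtain a
reduction from a multi-distinguisher attacker $A$ with advantage $\varepsilon$ to a
single-distinguisher attacker $B$ with advantage $\varepsilon/n$. □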
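
The hybrid argument can be checked exactly on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: it uses three toy deterministic public keys, a multi-user distinguisher that looks only at the first ciphertext, and 0-based indices. Because everything is deterministic, the advantages are computed by enumerating the random bits rather than by sampling.

```python
from itertools import product

# Three toy deterministic public keys (N, e); illustrative only, not secure.
PKS = [(3233, 17), (3127, 3), (2773, 17)]
X0, X1 = [2, 2, 2], [3, 3, 3]           # the two plaintext vectors chosen by A1

def enc(pk, x):
    n, e = pk
    return pow(x, e, n)

def a2(y):
    """Multi-user distinguisher: inspects only the first ciphertext."""
    return 0 if y[0] == enc(PKS[0], X0[0]) else 1

def adv_a():
    """Exact advantage of A: average over the challenge bit b."""
    wins = sum(a2([enc(pk, (X0, X1)[b][j]) for j, pk in enumerate(PKS)]) == b
               for b in (0, 1))
    return 2 * wins / 2 - 1

def adv_b(i):
    """Exact advantage of B_i (i is 0-based): average over b and the coin b'."""
    wins = 0
    for b, b_prime in product((0, 1), repeat=2):
        y = []
        for j, pk in enumerate(PKS):
            if j < i:
                bit = b_prime           # left part: coordinates of x^{b'}
            elif j == i:
                bit = b                 # the single-user challenge
            else:
                bit = 1 - b_prime       # right part: coordinates of the other vector
            y.append(enc(pk, (X0, X1)[bit][j]))
        wins += (a2(y) == b)
    return 2 * wins / 4 - 1

advantages = [adv_b(i) for i in range(len(PKS))]
```

In this example the whole advantage of the multi-user distinguisher is concentrated on the first single-user adversary, and the single-user advantages sum to the multi-user one, as the telescoping sum predicts.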

4 Non-malleability 

4.1 Single-User Non-malleability 

The notion of non-malleability was introduced by Dolev, Dwork and Naor and
formalized in a different manner in [2]. The main idea is that, given an encrypted
message $y$, an adversary is unable to output a ciphertext $y'$ whose decryption is
related to the decryption of $y$. More precisely, this is captured by an interactive
experiment with an adversary $A = (A_1, A_2)$, which is described below.

The Turing machine $A_1$ is run with input of a public key $pk$ and outputs
the description of a probabilistic polynomial time Turing machine $M$, and a
string $s$ for further computation. The output of $M$ defines a distribution of
plaintext messages whose support is a set $|M| \subseteq \mathcal{M}$. In the following, $M$ refers
to the Turing machine as well as its output distribution. Then a message $x$ is
randomly chosen by running $M$ and its encryption $y$ is given to $A_2$. The goal of
$A_2$ is to output a binary relation $R$ over $|M| \times \mathcal{M}$ and a ciphertext $y' \neq y$ whose
decryption $x'$ is related to $x$ according to $R$. The scheme is non-malleable if for
any adversary the probability that $R(x, x')$ holds is not significantly better than
the probability that $R(\tilde{x}, x')$ holds for a random $\tilde{x}$ from $M$.
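
As an illustration of what this experiment is meant to rule out, the sketch below shows that a toy El Gamal instance is malleable: a ciphertext of an unknown $x$ can be turned into a different ciphertext of $4x$ without the secret key. The group parameters and the names `maul` and `relation` are assumptions made for this example only, and the sizes are far too small for real use.

```python
import secrets

P, Q, G = 467, 233, 4      # toy group (assumed parameters, insecure sizes)

def keygen():
    sk = secrets.randbelow(Q - 1) + 1
    return pow(G, sk, P), sk

def encrypt(pk, x):
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (x * pow(pk, r, P)) % P

def decrypt(sk, y):
    c1, c2 = y
    return (c2 * pow(pow(c1, sk, P), -1, P)) % P   # needs Python 3.8+

def maul(y, factor):
    """Turn an encryption of x into an encryption of factor * x, without sk."""
    c1, c2 = y
    return c1, (c2 * factor) % P

def relation(x, x_prime):
    """R(x, x'): x' is four times x modulo P."""
    return x_prime == (4 * x) % P

pk, sk = keygen()
x = pow(G, 7, P)                    # the plaintext, unknown to the adversary
y = encrypt(pk, x)
y_prime = maul(y, 4)                # adversary's output, different from y
x_prime = decrypt(sk, y_prime)
```

Here the relation holds with certainty for the real plaintext, whereas it would rarely hold for an independent random plaintext, so the adversary's advantage is large.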

For notational convenience we have simplified the definition given in [2]. In
the original paper, the goal of the adversary was to output a vector $\underline{y}'$ of $t - 1$
ciphertexts related to $y$ according to a relation $R$ of arity $t$. In this case, it is
required that no coordinate of $\underline{y}'$ is equal to $y$. It was also proven that the two
definitions are not equivalent: the former cannot be reduced to the latter.
In the rest of our paper we will only represent elements $y'$ with one coordinate so
that no confusion arises with vectors from the broadcast notation. But one can
also build a similar theory of multi-user non-malleability for relations of arity $t$
by considering the modified ciphertext as a vector of ciphertext vectors $\underline{y}'$ and
an appropriate binary relation over $|M|$ and the corresponding product of message
spaces.

Recently, it was shown by Bellare and Sahai that non-malleability (in
any attack model) is equivalent to indistinguishability where the adversary
gets the additional power of a “parallel ciphertext attack” (i.e. a non-adaptive
ciphertext attack after seeing the challenge encryption). Consequently, our first
result may apply to this notion. However, we followed the standard definition of
non-malleability and proved that it may be simplified.

4.2 Multi-user Non-malleability

One can envision scenarios where it is unclear whether single-user
non-malleability is enough to ensure a satisfactory notion of security: for example,
the view of different encryptions under several public keys might give an adversary
the opportunity to flip one of the encrypted messages into its opposite. It is also
not clear that encrypted messages sent to different users may not be exchanged.
Thus, if one wishes to cover the standard context of multicast, it is natural to give
an extended notion of security for non-malleability, which we now undertake.

The adversary is given $n$ public keys and outputs a probabilistic polynomial
time Turing machine $M$ plus a string $s$. By running $M$ on a random source,
we require that its output defines a distribution of plaintext messages whose
support $|M|$ is in $\mathcal{M}^n$. Then, a vector $\underline{x}$ is randomly chosen by running $M$, and
its coordinatewise encryption according to the different public keys is given to
$A_2$. The goal of $A_2$ is to output a vector of ciphertexts $\underline{y}'$ and a relation $R$ over
$|M| \times \mathcal{M}^n$. $A$ is successful if $R$ relates the corresponding decrypted messages.
The formal definition is given below.

Remark. The exact support $|M|$ of $M$ may not be computable in polynomial
time. It is therefore only required that the relation $R$ is defined on a subset of
$\mathcal{M}^n \times \mathcal{M}^n$ and covers $|M| \times \mathcal{M}^n$.

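Definition 3. Multi-user non-malleability.

Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme with security parameter $k$ and let
$A = (A_1, A_2)$ be an adversary. For $k, n \in \mathbb{N}$, we define the advantage:

\[
\mathrm{Adv}_{A,\Pi}(k, n) = \left| \mathrm{Succ}_{A,\Pi}(k, n) - \mathrm{Succ}_{A,\Pi,\$}(k, n) \right|,
\]

where

\[
\mathrm{Succ}_{A,\Pi}(k, n) = \Pr\left[
\begin{aligned}
&(\underline{pk}, \underline{sk}) \leftarrow \mathcal{K}^n(1^k);\ (M, s) \leftarrow A_1(\underline{pk});\ \underline{x} \leftarrow M;\ \underline{y} \leftarrow \mathcal{E}^n_{\underline{pk}}(\underline{x}); \\
&(R, \underline{y}') \leftarrow A_2(M, s, \underline{y});\ \underline{x}' \leftarrow \mathcal{D}^n_{\underline{sk}}(\underline{y}') \ :\ \bot \notin \underline{x}' \wedge R(\underline{x}, \underline{x}')
\end{aligned}
\right]
\]

\[
\mathrm{Succ}_{A,\Pi,\$}(k, n) = \Pr\left[
\begin{aligned}
&(\underline{pk}, \underline{sk}) \leftarrow \mathcal{K}^n(1^k);\ (M, s) \leftarrow A_1(\underline{pk});\ \underline{x}, \underline{\tilde{x}} \leftarrow M;\ \underline{y} \leftarrow \mathcal{E}^n_{\underline{pk}}(\underline{x}); \\
&(R, \underline{y}') \leftarrow A_2(M, s, \underline{y});\ \underline{x}' \leftarrow \mathcal{D}^n_{\underline{sk}}(\underline{y}') \ :\ \bot \notin \underline{x}' \wedge R(\underline{\tilde{x}}', \underline{x}')
\end{aligned}
\right]
\]

with

\[
\tilde{x}'_i = \begin{cases} x_i & \text{if } y'_i = y_i \\ \tilde{x}_i & \text{if } y'_i \neq y_i \end{cases}
\qquad \text{for each } i \text{ in } \{1, .., n\}.
\]

We say that $\Pi$ is multi-user non-malleable (M-NM) if for every polynomial time
adversary $A$ whose output is a distribution of plaintexts $M$ and a relation $R$, both
computable in polynomial time, $\mathrm{Adv}_{A,\Pi}$ is negligible.

The motivation to introduce a new variable $\underline{\tilde{x}}'$ was to restrict the domain
of the random variable $\underline{\tilde{x}}$ for the coordinates left unchanged by $A_2$. This is the
condition in dimension $n$ of the requirement $y' \neq y$ in dimension 1, defined in [2].
This rule makes the adversary gain no advantage in partially copying a vector
of ciphertexts and outputting a relation whose value is true on domains of the
form $((x_0, *, .., *), (x_0, *, .., *))$.

The usual notion of (single-user) non-malleability is the particular case where
$n$ is fixed to 1.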

4.3 Results 

The next result is the main technical achievement of our paper and leads to a 
simplified definition of non-malleability. It claims that the distribution of plain- 
texts M can be restricted to an atomic form. 

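Lemma 1. Atomic non-malleability.

Let $\Pi$ be an encryption scheme and let $A$ be an adversary attacking $\Pi$ in the
sense of M-NM. Then there exists another adversary $B$ attacking $\Pi$ in the
sense of M-NM such that the distribution of plaintexts that $B$ outputs is always
a uniform distribution over two vectors of plaintexts. Moreover, the running time
of $B$ is that of $A$ plus the running time of the Turing machine $M$ output by $A$.

Proof. The adversary $B = (B_1, B_2)$ is defined as follows:

Algorithm $B_1(\underline{pk})$:
    $(M, s) \leftarrow A_1(\underline{pk})$
    $\underline{a}^0 \leftarrow M$; $\underline{a}^1 \leftarrow M$
    return $(\{\underline{a}^0, \underline{a}^1\}, s)$

Algorithm $B_2(\underline{y}, s)$:
    $(R, \underline{y}') \leftarrow A_2(\underline{y}, s)$
    return $(R, \underline{y}')$

Here the description of $B_2$ is identical to $A_2$ except that the relation $R$ is
restricted to the set $\{\underline{a}^0, \underline{a}^1\} \times \mathcal{M}^n$ instead of $M \times \mathcal{M}^n$. We first claim that the
input distribution of the ciphertexts is the same for $A_2$ and $B_2$. Indeed, using
Bayes’ theorem and since $\underline{x}$ has equal probability 1/2 of being $\underline{a}^0$ or $\underline{a}^1$, it results
that for all $X$ in $M$:

\[
\begin{aligned}
&\Pr\left[ \underline{a}^0, \underline{a}^1 \leftarrow M;\ \underline{x} \leftarrow \{\underline{a}^0, \underline{a}^1\} \ :\ \underline{x} = X \right] \\
&\quad = \tfrac{1}{2} \Pr\left[ \underline{a}^0 \leftarrow M \ :\ \underline{a}^0 = X \right] + \tfrac{1}{2} \Pr\left[ \underline{a}^1 \leftarrow M \ :\ \underline{a}^1 = X \right] \\
&\quad = \Pr\left[ \underline{x} \leftarrow M \ :\ \underline{x} = X \right]
\end{aligned}
\]

Consequently, $\mathrm{Succ}_{B,\Pi} = \mathrm{Succ}_{A,\Pi}$. Next, in order to express $\mathrm{Succ}_{B,\Pi,\$}$ we
decorrelate $\underline{\tilde{x}}$ from $\underline{x}$, considering its two possible values among $\{\underline{a}^0, \underline{a}^1\}$. Using
the notations from Definition 3, it holds:

\[
\begin{aligned}
&\Pr\left[ \underline{a}^0, \underline{a}^1 \leftarrow M;\ \underline{x}, \underline{\tilde{x}} \leftarrow \{\underline{a}^0, \underline{a}^1\} \ :\ R(\underline{\tilde{x}}', \underline{x}') \right] \\
&\quad = \tfrac{1}{2} \Pr\left[ \underline{a}^0, \underline{a}^1 \leftarrow M;\ \underline{x}, \underline{\tilde{x}} \leftarrow \{\underline{a}^0, \underline{a}^1\} \ :\ R(\underline{\tilde{x}}', \underline{x}') \mid \underline{\tilde{x}} = \underline{x} \right] \\
&\qquad + \tfrac{1}{2} \Pr\left[ \underline{a}^0, \underline{a}^1 \leftarrow M;\ \underline{x}, \underline{\tilde{x}} \leftarrow \{\underline{a}^0, \underline{a}^1\} \ :\ R(\underline{\tilde{x}}', \underline{x}') \mid \underline{\tilde{x}} \neq \underline{x} \right] \\
&\quad = \tfrac{1}{2} \Pr\left[ \underline{a}^0, \underline{a}^1 \leftarrow M;\ \underline{x} \leftarrow \{\underline{a}^0, \underline{a}^1\} \ :\ R(\underline{x}, \underline{x}') \right] \\
&\qquad + \tfrac{1}{2} \Pr\left[ \underline{a}^0, \underline{a}^1 \leftarrow M;\ \underline{x} = \underline{a}^0,\ \underline{\tilde{x}} = \underline{a}^1 \ :\ R(\underline{\tilde{x}}', \underline{x}') \right]
\end{aligned}
\]

So, $\mathrm{Succ}_{B,\Pi,\$} = \tfrac{1}{2} \mathrm{Succ}_{B,\Pi} + \tfrac{1}{2} \mathrm{Succ}_{A,\Pi,\$}$ and
$\mathrm{Adv}_{B,\Pi} = \mathrm{Succ}_{B,\Pi} - \mathrm{Succ}_{B,\Pi,\$} = \tfrac{1}{2} \mathrm{Succ}_{B,\Pi} - \tfrac{1}{2} \mathrm{Succ}_{A,\Pi,\$}$.
With the previous result, we conclude

\[
\mathrm{Adv}_{B,\Pi} = \tfrac{1}{2} \mathrm{Adv}_{A,\Pi}
\]

□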
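
The first claim of the proof, that drawing two samples from $M$ and then picking one of them uniformly reproduces the distribution $M$, can be checked exactly on a small example. The distribution below is an arbitrary assumption chosen for illustration, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

# An arbitrary example distribution M over plaintext vectors (assumed values).
M = {("a", "a"): Fraction(1, 2), ("a", "b"): Fraction(1, 3), ("b", "c"): Fraction(1, 6)}

def two_point_marginal(dist):
    """Distribution of x when a0, a1 <- dist independently and x <- {a0, a1} uniformly."""
    out = {v: Fraction(0) for v in dist}
    for (a0, p0), (a1, p1) in product(dist.items(), repeat=2):
        out[a0] += p0 * p1 / 2       # x = a0 with probability 1/2
        out[a1] += p0 * p1 / 2       # x = a1 with probability 1/2
    return out

marginal = two_point_marginal(M)
```

Using exact fractions avoids rounding, so `marginal` can be compared with `M` for equality.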



It is easily seen that the definition of single-user non-malleability is the restricted 
case of the multi-user non-malleability for n = 1 . The equivalence follows from 
the next result. 

Theorem 2. S-NM $\Rightarrow$ M-NM. If encryption scheme $\Pi$ is single-user
non-malleable, then it is multi-user non-malleable.

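Proof. Let $A = (A_1, A_2)$ be an adversary attacking $\Pi$ in the sense of multi-user
non-malleability with an advantage $\varepsilon$. Without loss of generality, as was shown in
Lemma 1, we assume that $A_1$ outputs a uniform distribution $M$ over two plaintext
vectors $\underline{a}^0$ and $\underline{a}^1$. We will build $n$ Turing machines $B_1, .., B_n$ attacking $\Pi$ in
the sense of single-user non-malleability. For any $i \in \{1, .., n\}$, the description of
$B_i = (B_{i,1}, B_{i,2})$ is as follows:

Algorithm $B_{i,1}(pk_i)$:
    $\underline{pk} \leftarrow (pk_1, .., pk_i, .., pk_n)$
    $(M, s) \leftarrow A_1(\underline{pk})$
    return $M_i = \{a_i^0, a_i^1\}$

Algorithm $B_{i,2}(c_i, s)$:
    $b' \leftarrow \{0, 1\}$
    $\underline{c} \leftarrow (c_1, .., c_n)$ with $c_j = \mathcal{E}_{pk_j}(a_j^{b'})$ if $j < i$
                                  $c_j = \mathcal{E}_{pk_j}(a_j^{\overline{b'}})$ if $j > i$
    $(\underline{c}', R) \leftarrow A_2(\underline{c}, s)$
    $R_i(a_i^{\beta}, u) \leftarrow R(\underline{a}^{\beta}, \underline{v})$ for $\beta \in \{0, 1\}$, with $v_i = u$
                                  $v_j = \mathcal{D}_{sk_j}(c'_j)$ if $j \neq i$
    return $(c'_i, R_i)$

As in the previous construction, the first part of the algorithm extends the
input $pk_i$ into a vector $\underline{pk}$ and calls the attacker $A_1$ on this data. Without loss
of generality, as was shown in Lemma 1, $A_1$ outputs a distribution $M$ of two
plaintext vectors $\underline{a}^0$ and $\underline{a}^1$. Then both $i$-th coordinates are returned. The algorithm
$B_{i,2}$ takes as input the ciphertext $c_i$ of a plaintext $a_i^b$ where $b$ is an unknown bit.
We focus on the way the binary relation $R_i$ over $\{a_i^0, a_i^1\} \times \mathcal{M}$ is built from the
initial relation $R$ over $\{\underline{a}^0, \underline{a}^1\} \times \mathcal{M}^n$. Since the expression of the advantage of $A$
only depends on the decryption of $\underline{c}'$, we let the $i$-th coordinate free and fix the
others to the decrypted coordinates of $\underline{c}'$ thanks to the knowledge of the related
secret keys. Thus $R_i$ is the section of $R$ on this particular sub-space. Note that the
exact definition of $R_i$ may be ambiguous in the case where $a_i^0 = a_i^1$ and
$\underline{a}^0 \neq \underline{a}^1$. Here, it is clear that any attacker (even infinitely powerful) obtains a
null advantage since the encryption of $a_i^b$ is perfectly independent of the bit $b$.
Thus in this specific case, the definition of $R_i$ has little importance, and for
convenience, it is defined by choosing $b$ randomly so that the following
computations remain true.

We now fix $\underline{pk}$, $\underline{a}^0$ and $\underline{a}^1$. The main goal is to analyze the behavior of
the adversary $A_2$ when its input is a hybrid vector of ciphertexts from $\underline{a}^0$ and
$\underline{a}^1$. Let $\mathrm{Pr}_i$ (respectively $\mathrm{Pr}'_i$) be the probability that $A_2$ successfully outputs
a ciphertext related to the first (respectively last) part of the initial hybrid
plaintext.

\[
\mathrm{Pr}_i = \Pr\left[ b \leftarrow \{0,1\};\ \underline{c} \leftarrow \mathcal{E}^n_{\underline{pk}}(a_1^b, .., a_i^b, a_{i+1}^{\bar{b}}, .., a_n^{\bar{b}});\ (\underline{c}', R) \leftarrow A_2(\underline{c}, s) \ :\ R(\underline{a}^b, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \right]
\]
\[
\mathrm{Pr}'_i = \Pr\left[ b \leftarrow \{0,1\};\ \underline{c} \leftarrow \mathcal{E}^n_{\underline{pk}}(a_1^b, .., a_i^b, a_{i+1}^{\bar{b}}, .., a_n^{\bar{b}});\ (\underline{c}', R) \leftarrow A_2(\underline{c}, s) \ :\ R(\underline{a}^{\bar{b}}, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \right]
\]

Remark: If $a_i^0 = a_i^1$ then $a_i$ can be linked identically to the left part or the right
part of the hybrid, hence $\mathrm{Pr}_i = \mathrm{Pr}_{i-1}$ and $\mathrm{Pr}'_i = \mathrm{Pr}'_{i-1}$.

It follows from the above definitions that $\mathrm{Pr}_n = \mathrm{Pr}'_0$ and $\mathrm{Pr}'_n = \mathrm{Pr}_0$.
The success of the attacker $B_i$ is:

\[
\begin{aligned}
\mathrm{Succ}_{B_i,\Pi}
&= \Pr\left[ b, b' \leftarrow \{0,1\};\ \underline{c} \leftarrow (c_1, .., c_i, .., c_n);\ (\underline{c}', R) \leftarrow A_2(\underline{c}, s) \ :\ R(\underline{a}^b, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \right] \\
&= \tfrac{1}{2} \Pr\left[ b, b' \leftarrow \{0,1\};\ \underline{c} \leftarrow (c_1, .., c_n);\ (\underline{c}', R) \leftarrow A_2(\underline{c}, s) \ :\ R(\underline{a}^b, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \mid b' = b \right] \\
&\quad + \tfrac{1}{2} \Pr\left[ b, b' \leftarrow \{0,1\};\ \underline{c} \leftarrow (c_1, .., c_n);\ (\underline{c}', R) \leftarrow A_2(\underline{c}, s) \ :\ R(\underline{a}^b, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \mid b' \neq b \right] \\
&= \tfrac{1}{2} \mathrm{Pr}_i + \tfrac{1}{2} \mathrm{Pr}'_{i-1}
\end{aligned}
\]

The average success $\mathrm{Succ}_{B_i,\Pi,\$}$ is obtained by considering the four possible
values of the $B$-bit $b'$ and the random bit $\tilde{b}$ relatively to the challenge bit $b$. Since
$b'$ splits the vector $\underline{c}$ into a left part of $i - 1$ encrypted coordinates from $b'$ and a
right part of $(n - i)$ encrypted coordinates from $\overline{b'}$, whether $b$ is equal to $b'$ or $\overline{b'}$
leads to a hybrid vector $\underline{c}$ whose frontier is at position $i$ or $i - 1$. In each case,
whether the random bit $\tilde{b}$ is the left or the right part of the hybrid vector $\underline{c}$
leads to one of the expressions $\mathrm{Pr}$ or $\mathrm{Pr}'$.

Let $\delta$ be the distribution
$\delta = \left[ b, b', \tilde{b} \leftarrow \{0,1\};\ \underline{c} \leftarrow (c_1, .., c_i, .., c_n);\ (\underline{c}', R) \leftarrow A_2(\underline{c}, s) \right]$.

\[
\begin{aligned}
\mathrm{Succ}_{B_i,\Pi,\$}
&= \Pr\nolimits_\delta\left[ R(\underline{a}^{\tilde{b}}, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \right] \\
&= \tfrac{1}{4} \Pr\nolimits_\delta\left[ R(\underline{a}^{\tilde{b}}, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \mid \tilde{b} = b \wedge b' = b \right] \\
&\quad + \tfrac{1}{4} \Pr\nolimits_\delta\left[ R(\underline{a}^{\tilde{b}}, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \mid \tilde{b} = b \wedge b' \neq b \right] \\
&\quad + \tfrac{1}{4} \Pr\nolimits_\delta\left[ R(\underline{a}^{\tilde{b}}, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \mid \tilde{b} \neq b \wedge b' = b \right] \\
&\quad + \tfrac{1}{4} \Pr\nolimits_\delta\left[ R(\underline{a}^{\tilde{b}}, \mathcal{D}^n_{\underline{sk}}(\underline{c}')) \mid \tilde{b} \neq b \wedge b' \neq b \right] \\
&= \tfrac{1}{4} \mathrm{Pr}_i + \tfrac{1}{4} \mathrm{Pr}'_{i-1} + \tfrac{1}{4} \mathrm{Pr}'_i + \tfrac{1}{4} \mathrm{Pr}_{i-1}
\end{aligned}
\]

It follows that the advantage of $B_i$ is:

\[
\mathrm{Adv}_{B_i} = \mathrm{Succ}_{B_i,\Pi} - \mathrm{Succ}_{B_i,\Pi,\$} = \tfrac{1}{4} \mathrm{Pr}_i + \tfrac{1}{4} \mathrm{Pr}'_{i-1} - \tfrac{1}{4} \mathrm{Pr}'_i - \tfrac{1}{4} \mathrm{Pr}_{i-1}
\]

Remark: if $a_i^0 = a_i^1$ then from the previous remark $\mathrm{Adv}_{B_i} = 0$ as expected.
Finally the sum is:

\[
\sum_{i=1}^{n} \mathrm{Adv}_{B_i} = \tfrac{1}{4} \left( \mathrm{Pr}_n + \mathrm{Pr}'_0 - \mathrm{Pr}'_n - \mathrm{Pr}_0 \right) = \tfrac{1}{2} \left( \mathrm{Pr}_n - \mathrm{Pr}_0 \right) = \mathrm{Adv}_A
\]

Thus, if $i$ is randomly chosen in the set $\{1, .., n\}$, one obtains a reduction from
a global adversary with advantage $\varepsilon$ to an adversary with advantage $\varepsilon/n$ against
a single cryptosystem. □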

Consequences of the results. In the case of adaptive chosen ciphertext attacks,
it was proved by Bellare et al. [2] that both notions of indistinguishability and
non-malleability are equivalent, and hence are also equivalent to the multi-user
notions of security. Thus, our results show that some recent encryption schemes
achieve a high level of multicast security. In the random oracle
model, one can mention the RSA-based OAEP from Bellare and Rogaway. It
was recently adopted as a standard of encryption in the PKCS#1
specifications. In the standard model of proofs, only the Cramer-Shoup scheme
achieves proven security and practical effectiveness. Finally, we point out some
practical and straightforward applications of multi-user secure encryption. This
includes pay-per-view television, where a part of the bandwidth is used to
broadcast encrypted keys to each user. Secure electronic mail such as PGP is also
given better confidence, especially when addressing several recipients. One may
also envision secure election protocols with a large number of independent
authorities, generally resulting in many related encrypted plaintexts. Lastly,
multi-party computations usually use the assumption of a broadcast channel and thus
should benefit from our multicast notions of security.

5 Conclusion 

We have extended the applicability of two powerful notions of security: indistin- 
guishability and non-malleability. Every known attack is now covered by our new 
multicast security definitions. Furthermore, the reductions that we have shown 
have linear coefficients in the number of users. As a consequence, we believe 
that proven encryption schemes with common single-user security parameters
are ready to be safely spread over the Internet.

Abstract. This paper investigates one-round secure computation between
two distrusting parties: Alice and Bob each have private inputs to
a common function, but only Alice, acting as the receiver, is to learn the 
output; the protocol is limited to one message from Alice to Bob followed 
by one message from Bob to Alice. A model in which Bob may be compu- 
tationally unbounded is investigated, which corresponds to information- 
theoretic security for Alice. It is shown that 

1. for honest-but-curious behavior and unbounded Bob, any function
computable by a polynomial-size circuit can be computed securely
assuming the hardness of the decisional Diffie-Hellman problem;

2. for malicious behavior by both (bounded) parties, any function com- 
putable by a polynomial-size circuit can be computed securely, in a 
public-key framework, assuming the hardness of the decisional Diffie- 
Hellman problem. 

The results are applied to secure autonomous mobile agents, which mi- 
grate between several distrusting hosts before returning to their origina- 
tor. A scheme is presented for protecting the agent’s secrets such that 
only the originator learns the output of the computation. 



1 Introduction 

Suppose Alice has a secret input x, Bob has a secret input y, and they wish
to compute g(x, y) securely using one round of interaction: Alice should learn
g(x, y) but nothing else about y and Bob should learn nothing at all.
Communication is restricted to one message from Alice to Bob followed by one message
from Bob to Alice. Without the restriction on the number of rounds, this is the
problem of secure function evaluation introduced by Yao and Goldreich et
al. It is known that under cryptographic assumptions, every function can
be computed securely and using a (small) constant number of rounds.

The problem is closely related to the question of “computing with encrypted
data”: Alice holds some input x, Bob holds a function f, and Alice should
learn f(x) in a one-round protocol, where Alice sends to Bob an “encryption”
of x, Bob computes f on the “encrypted” data x and sends the result to Alice,
who “decrypts” this to f(x).

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 512- K^ 2000. 

© Springer-Verlag Berlin Heidelberg 2000

The dual of this is “computing with encrypted functions,” where Alice holds
a function f, Bob holds an input y, and Alice should get f(y) in a one-round
protocol. This scenario has received considerable attention recently because it
corresponds to protecting mobile code that is running on a potentially malicious
host, which might be spying on the secrets of the code.

In the next paragraphs, honest-but-curious behavior is assumed before we 
turn to arbitrary malicious behavior. Honest-but-curious behavior models a pas- 
sively cheating party who follows the protocol, but might try to infer illegitimate 
information later on. 

Homomorphic Encryption and Computing with Encrypted Data. One popular
approach to “computing with encrypted data” is to search for a public-key
encryption scheme (E, D) with the following homomorphic property: given E(x)
and E(y) one can efficiently compute E(x + y) and E(xy). Now, if Alice knows
the private key D and sends Bob the public key E together with the encrypted
data E(x), then Bob can without interaction compute E(f(x)) and send it back
to Alice. Although this has been a prominent open problem for years, it is
still unknown whether such homomorphic encryption schemes exist. On the one
hand, Boneh and Lipton [7] have shown that all such deterministic encryption
schemes are insecure; on the other hand, Sander, Young, and Yung propose a
scheme that allows the necessary operations on encrypted data, but comes at the
cost of a multiplicative blowup per gate, which limits the possible computations
to functions with log-depth circuits.
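
As a partial illustration of the homomorphic property, the sketch below shows that textbook RSA lets Bob compute E(xy) from E(x) and E(y) without the private key. This is only an assumed toy example: the key is far too small, the scheme supports multiplication but not addition, and it is deterministic, so it falls under the negative result for deterministic schemes mentioned above.

```python
# Toy textbook RSA key (assumed for illustration; insecure size, deterministic).
N, E, D = 3233, 17, 2753         # 3233 = 61 * 53 and E * D = 1 mod 3120

def enc(x):
    return pow(x, E, N)

def dec(y):
    return pow(y, D, N)

x, y = 12, 34
product_ciphertext = (enc(x) * enc(y)) % N    # computed by Bob without the private key
decrypted_product = dec(product_ciphertext)   # Alice recovers x * y mod N
```

A scheme meeting the full requirement would need an analogous operation for E(x + y) as well.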

Computational Assumptions. Note that the above approach to “computing with
encrypted data” assumes a computationally bounded Bob, who cannot learn
anything about the encrypted values. Alice, however, knows all secrets involved
and seems not restricted in her computational power. Thus, the distinguishing
feature of “computing with encrypted data” seems to be that it remains
secure against an unbounded Alice. (In fact, the protocol of Sander et al. is
information-theoretically secure for Bob.)

Assume instead that Alice, the receiver of the output, is bounded and Bob
is unbounded and consider the same question: is there a one-round secure
computation protocol for all efficiently computable functions? We give a positive
answer in a later section: any function computable by a polynomial-sized circuit
has a one-round secure computation scheme in this model. The result is obtained by
combining Yao’s “encrypted circuit” method for secure computation with a
one-round oblivious transfer protocol. To our knowledge, this is the first
one-round secure computation protocol for arbitrary polynomial-time computations
and gives a partial answer to the long-standing open question of computing with
encrypted data mentioned above.

If both parties are bounded, the above solution applies as well (we can even 
obtain stronger results, see below). Conversely, it is well known that secure
computation between two unbounded parties with “full information” is impossible
for arbitrary functions and limited to trivial functions g where g(x,y) gives
full information about y. The following table summarizes the current state of
one-round secure computation (both supply input, only Alice receives output):



Alice       Bob         securely computable functions   reference
---------   ---------   -----------------------------   -------------
unbounded   unbounded   only trivial ones               BGW
unbounded   bounded     log-depth circuits              Sander et al.
bounded     unbounded   polynomial-size circuits        this paper
bounded     bounded     polynomial-size circuits        this paper



Malicious Parties. We also investigate the malicious model, where both parties 
might be actively cheating. One cannot demand that Bob ever sends a second 
message, but if he does, and Alice accepts, the model ensures that Alice obtains 
g(x,y) for her input x and some y. We show that if Alice and Bob are both
computationally bounded, then a one-round protocol exists also in the malicious 
model, provided they share a random string and that Alice has a public key for 
which she is guaranteed to know the private key. This is a realistic model, which 
is also used elsewhere (e.g., da). 

These results seem essentially optimal because one round of communication 
is needed to implement oblivious transfer.

Securing Autonomous Mobile Agents. One-round secure computation has been 
recognized as the solution for keeping the privacy of mobile code intact. Here,
a code originator O sends one message containing a protected description of the 
mobile code to host H, which “runs” the program and sends some output back to 
O, who decodes the output. (This is an instance of “computing with encrypted 
functions.”) The results in this paper on one-round secure computation directly 
yield mobile code privacy for all polynomial-time mobile computations. This is 
a vast improvement over both the solutions of Sander and Tschudin (which 
works for functions representable as polynomials) and the one of Sander et al. 
(which works for functions computable by log-depth circuits). 

In our solution the relative complexities of the computations by O and H are 
similar; for example, if H runs a long, complex computation with a short output, 
then O’s decoding work is proportional to the complex computation, despite the 
output being short. We do not know if there are general schemes with “small” 
decoding complexity for O. 

The above models are limited to mobile code that visits only one host,
however. In Section 5 a protocol is presented that allows an autonomous mobile
agent to visit several distrusting hosts, which need not be fixed ahead of time. 
This flexibility is one of the main benefits of the mobile code paradigm. As with 
an unencrypted autonomous agent, the communication flow must correspond to 
a closed path starting and ending at O. The secure computation protocol involves 
constructing a cascade of Yao-style circuits by the hosts and its evaluation by 
O. No host learns anything about the agent’s or the other hosts’ secrets. 

Related Work. Protocols for two-party secure function evaluation between a 
bounded and an unbounded party have previously been proposed by Chaum, 
Damgård, and van de Graaf and by Abadi and Feigenbaum. The former
hides the inputs of one party information-theoretically and the latter hides the
circuit information-theoretically from the other party (regardless of who receives 
the output). Both protocols have round complexity proportional to the depth of 
the circuit, however. 

The work of Abadi, Feigenbaum, and Kilian [2] on information hiding from
an oracle assumes an all-powerful oracle that helps a user with insufficient
resources in computing f(x) for his input x; the approach is to transform x into
an encrypted instance y and have the oracle compute f(y) such that it learns
nothing about x but the user can infer f(x) from f(y). The two main differences
to our model are (1) that Bob may also provide an input and (2) that the oracle
is limited to computing f(·).

Feige, Kilian, and Naor consider a related model in which two parties
perform secure computation by sending a single message each to a third party. 

2 Definitions 

Recall the three scenarios of one-round secure computation introduced above: 
computing with encrypted functions, computing with encrypted data, and secure 
function evaluation. Using a universal circuit for g in secure function evaluation, 
it is straightforward to realize the first two scenarios from the third one by 
supplying f as input (at the cost of a polynomial expansion). An equivalence in
the other direction is possible by letting f be g with one party’s inputs fixed.

The remainder of this section presents definitions for one-round secure com- 
putation using secure function evaluation. Formal definitions may be constructed 
using the methods in mm and are provided in the full version of the paper. 

The security parameter is denoted by k and a quantity $\epsilon_k$ is called
negligible (as a function of k) if for all c > 0 there exists a constant $k_0$
such that $\epsilon_k < 1/k^c$ for all $k > k_0$. Throughout we assume that the
security parameter k, as well
as other system parameters, are always part of the input to all algorithms and 
protocols. 

Honest-but-Curious Model. This definition captures one-round secure
computation if both parties follow the protocol. A scheme has to ensure
correctness, privacy for Alice, and privacy for Bob.

More precisely, a one-round secure computation scheme in the honest-but-curious
model consists of three probabilistic polynomial-time algorithms $A_1(\cdot)$,
$A_2(\cdot,\cdot)$, and $B(\cdot,\cdot)$ such that (1) $\forall x \in \mathcal{X}$,
$\forall y \in \mathcal{Y}$, if $A_1(x)$ outputs $(s, m_1)$ and $B(y, m_1)$
outputs $m_2$, then $A_2(s, m_2)$ outputs $g(x,y)$ with all but negligible
probability; (2) there exists a simulator $sim_{Bob}$ that outputs $(s, m_1)$
such that $\forall x \in \mathcal{X}$, no efficient algorithm can distinguish
between the distributions output by $sim_{Bob}$ and the output of $A_1(x)$;
(3) there exists a simulator $sim_{Alice}$ that outputs $(s, m_1)$ such that
$\forall x \in \mathcal{X}$ and $\forall y \in \mathcal{Y}$, if $m_1$ is computed
from $A_1(x)$ and $m_2$ from $B(y, m_1)$, then no efficient algorithm can
distinguish between the distributions on $(x, m_1, m_2)$ induced by the real
protocol and by $sim_{Alice}$.

We say that the scheme is secure for bounded Alice and unbounded Bob if the 
distinguisher in (2) is an arbitrary algorithm and the one in (3) is
polynomial-time; similarly, we say it is secure for bounded Alice and bounded
Bob if both
distinguishers are polynomial-time algorithms. 

In the model above, $A_1$ is Alice’s query generator that outputs a message
$m_1$ sent to Bob and a secret s, B is Bob’s algorithm that outputs message $m_2$
that is sent to Alice, and $A_2$ is Alice’s decoding algorithm that interprets
Bob’s answer using s. (All algorithms are for a fixed function g.)

Malicious Model. The malicious model allows arbitrary behavior for (bounded) 
Alice and Bob. We must ensure that for every strategy of Alice, Bob’s reply 
does not reveal more to her about y than what follows from the function output 
g(x,y) on a particular x. Bob, on the other hand, must be bound to compute
$m_2$ such that Alice can recover g(x, y) for her x and on some legal y, chosen
independently from x, or have Alice reject. Intuitively, this can be solved by
having both parties supply a zero-knowledge proof with their message that it is
well-formed. However, a formal proof of security requires that these proofs are
proofs of knowledge. To this end, we use a public-key model, where each
party has registered a public key and a public source of randomness is available
(see Section 4.3).

3 Tools 

3.1 Oblivious Transfer 

A ubiquitous tool in secure computation is oblivious transfer. We use a
one-out-of-two oblivious transfer also known as ANDOS (all-or-nothing
disclosure of secrets): a sender S has two input values $a_0$ and $a_1$, which
are strings of arbitrary known length, and a receiver R has a bit c; R obtains
$a_c$, but should not learn anything about $a_{c \oplus 1}$ and S should not
learn c.

Let G be a group of large prime order q (of length polynomial in k) such
that p = 2q + 1 is prime and $G \subset \mathbb{Z}_p^*$ and let $g \in G$ be a
generator. (Note that this allows efficient sampling from G with uniform
distribution.) Consider two distributions $D_0$ and $D_1$ over $G^4$, where
$D_0 = (g, g^{\alpha}, g^{\beta}, g^{\gamma})$ with $g \in_R G$ and
$\alpha, \beta, \gamma \in_R \mathbb{Z}_q$ and
$D_1 = (g, g^{\alpha}, g^{\beta}, g^{\alpha\beta})$ with $g \in_R G$ and
$\alpha, \beta \in_R \mathbb{Z}_q$. The Decisional Diffie-Hellman (DDH)
assumption is that there exists no probabilistic polynomial-time algorithm that
distinguishes with non-negligible probability between $D_0$ and $D_1$.

The following is a sketch of the ANDOS protocol between a sender Bob and
a receiver Alice, denoted OT(c)($a_0$, $a_1$). Alice’s private input is a bit c
and Bob’s private inputs are $a_0, a_1 \in G$. Common inputs are p and g.

1. Bob chooses $\delta \in_R G$ and sends $\delta$ to Alice.

2. Alice chooses $\alpha \in_R \mathbb{Z}_q$, computes $\beta_c = g^{\alpha}$,
$\beta_{c \oplus 1} = \delta / \beta_c$ and sends $\beta_0, \beta_1$ to Bob.

3. Bob verifies that $\beta_0 \beta_1 = \delta$ and aborts if not. Otherwise, he
chooses $r_0, r_1 \in_R \mathbb{Z}_q$, computes
$(e_0, f_0) = (g^{r_0}, a_0 \beta_0^{r_0})$ and
$(e_1, f_1) = (g^{r_1}, a_1 \beta_1^{r_1})$, and sends $(e_0, f_0, e_1, f_1)$ to
Alice.

4. Alice obtains $a_c$ by computing $f_c / e_c^{\alpha}$.
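
The four steps above can be traced in a short Python sketch. It is only an illustration under assumptions that are not in the text: the toy parameters p = 2039, q = 1019, g = 4 (far too small for any security), the function names `alice_query`, `bob_reply`, and `alice_decode`, and the use of Python's `secrets` module for the random choices.

```python
import secrets

# Toy parameters (assumption, for illustration only): p = 2q + 1 with p, q prime,
# and g = 4 generating the subgroup G of order q in Z_p^*.
P, Q, G = 2039, 1019, 4


def rand_elem():
    """Sample a uniformly random element of G (Step 1, Bob's delta)."""
    return pow(G, secrets.randbelow(Q), P)


def alice_query(c, delta):
    """Step 2: Alice knows the discrete log of beta_c only."""
    alpha = secrets.randbelow(Q)
    beta_c = pow(G, alpha, P)
    beta_other = delta * pow(beta_c, -1, P) % P  # delta / beta_c
    betas = (beta_c, beta_other) if c == 0 else (beta_other, beta_c)
    return alpha, betas


def bob_reply(a0, a1, delta, betas):
    """Step 3: check beta_0 * beta_1 = delta, then encrypt a_b under beta_b."""
    b0, b1 = betas
    if b0 * b1 % P != delta:
        raise ValueError("beta_0 * beta_1 != delta, abort")
    reply = []
    for a, beta in ((a0, b0), (a1, b1)):
        r = secrets.randbelow(Q)
        reply.append((pow(G, r, P), a * pow(beta, r, P) % P))  # (e, f)
    return reply


def alice_decode(c, alpha, reply):
    """Step 4: a_c = f_c / e_c^alpha."""
    e, f = reply[c]
    return f * pow(pow(e, alpha, P), -1, P) % P
```

Running `alice_query`, `bob_reply`, and `alice_decode` in sequence for a bit c returns the value a_c that Bob supplied; the three-argument `pow` with exponent -1 requires Python 3.8 or later.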
It is easy to see that if both parties follow the protocol, Alice obtains $a_c$.
Consider security for Alice: c is perfectly hidden from Bob, i.e., in an
information-theoretic sense, because $\beta_0$ and $\beta_1$ are uniformly random
among all group elements with product $\delta$. Thus the protocol is secure for
Alice against an arbitrarily behaving unbounded Bob.

Consider security for Bob. In the honest-but-curious model, Alice chooses
$\beta_0$ and $\beta_1$ honestly, i.e., such that $\beta_{c \oplus 1}$ is a random
public key and, under the DDH assumption, $(e_{c \oplus 1}, f_{c \oplus 1})$ is a
semantically secure encryption of $a_{c \oplus 1}$. Hence the protocol is secure
for Bob against a bounded Alice. Furthermore, Step 1 of the protocol is not even
needed and Alice may compute $\delta \in_R G$ herself in Step 2. The resulting
protocol, denoted by OT-1(c)($a_0$, $a_1$), has only one round of interaction.

Assuming malicious behavior, a one-round version is also possible in the
public-key model using shared random information $\sigma$; this version is
denoted by OT-2(c)($a_0$, $a_1$). Here, Step 1 can again be omitted and Alice
chooses $\delta$ herself, using the sampling algorithm in G with $\sigma$ as
random source. Intuitively, she then sends $\delta$ along with
$\beta_0, \beta_1$ to Bob, who verifies that the choice of $\delta$ is correct
according to $\sigma$. However, Alice must also supply a “non-interactive
proof of knowledge” of $\alpha$, the discrete logarithm of either $\beta_0$ or
$\beta_1$ (we refer to Section 4.3 for how this can be done). With these changes,
the protocol can be proved secure for Bob against an arbitrarily behaving
bounded Alice.



3.2 Encrypted Circuit Construction 

Yao’s encrypted circuit construction implements secure function evaluation
between Alice and Bob such that Alice receives the output z = g(x, y).

We give an abstract version of Yao’s construction describing only those
properties essential to our analysis. A more detailed treatment of Yao’s
protocol is found in the literature. Let $(x_1, \ldots, x_{n_x})$,
$(y_1, \ldots, y_{n_y})$, and $(z_1, \ldots, z_{n_z})$ denote the binary
representation of x, y, and z, respectively, and let C denote a polynomial-sized
circuit computing $g(\cdot,\cdot)$. Yao’s construction consists of three
procedures: (1) an algorithm construct that Bob uses to construct
an encrypted circuit, (2) an interactive protocol transfer between Alice and Bob,
and (3) an algorithm evaluate allowing Alice to retrieve g(x, y). Additionally, the
proof of security requires a simulation result.

More precisely, the probabilistic algorithm construct(C, y) outputs the values
$\mathcal{C}, (K_{1,0}, K_{1,1}), \ldots, (K_{n_x,0}, K_{n_x,1}),
(U_{1,0}, U_{1,1}), \ldots, (U_{n_z,0}, U_{n_z,1})$. The first part
$\mathcal{C}$ is a representation for C, with input y hardwired in. It may be
viewed as an encrypted version of the $n_x$-input circuit $C(\cdot, y)$. In order
to compute C(x, y), one needs a k-bit key for each input bit $x_i$; the key
$K_{i,b}$ corresponds to the key used for the input $x_i = b$. The pairs
$(U_{i,0}, U_{i,1})$ represent the output bits, i.e., if decryption of the
circuit produces $U_{i,b}$, then the output bit $z_i$ is set to b.

The transfer protocol consists of $n_x$ parallel executions of ANDOS. In the
i-th execution, Bob has input $(K_{i,0}, K_{i,1})$ and Alice has input $x_i$.
That is, Alice learns $K_{1,x_1}, \ldots, K_{n_x,x_{n_x}}$, but nothing more,
whereas Bob learns nothing about $x_1, \ldots, x_{n_x}$. Bob also sends
$\mathcal{C}$ and $(U_{1,0}, U_{1,1}), \ldots, (U_{n_z,0}, U_{n_z,1})$ to Alice.
The algorithm evaluate($\mathcal{C}, K_{1,x_1}, \ldots, K_{n_x,x_{n_x}}$) outputs
either a special symbol reject or $U_{1,z_1}, \ldots, U_{n_z,z_{n_z}}$. From the
latter Alice can recover z, and if Alice and Bob obey the protocol, then
z = g(x, y).
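
To make the roles of construct and evaluate concrete, here is a deliberately tiny Python sketch for a circuit with a single input bit x and a single output bit, with Bob's bit y hardwired. It is not Yao's gate-by-gate construction: the hash-based one-time pad `enc`, the zero tag used to detect a wrong key, the 128-bit key length, and the helper `decode` are assumptions made only for this illustration.

```python
import hashlib
import secrets

KEY_BYTES = 16         # k-bit keys with k = 128 (assumption)
TAG = b"\x00" * 8      # redundancy so that decryption under a wrong key is detected


def enc(key, msg):
    """Toy symmetric encryption: XOR with a pad derived from the key."""
    pad = hashlib.sha256(key).digest()
    return bytes(p ^ m for p, m in zip(pad, msg + TAG))


def dec(key, ct):
    """Return the plaintext, or None if ct was not encrypted under key."""
    pad = hashlib.sha256(key).digest()
    plain = bytes(p ^ c for p, c in zip(pad, ct))
    return plain[:-len(TAG)] if plain.endswith(TAG) else None


def construct(g, y):
    """Encrypted version of the one-input circuit g(., y) with y hardwired."""
    K = (secrets.token_bytes(KEY_BYTES), secrets.token_bytes(KEY_BYTES))  # input keys
    U = (secrets.token_bytes(KEY_BYTES), secrets.token_bytes(KEY_BYTES))  # output keys
    table = [enc(K[b], U[g(b, y)]) for b in (0, 1)]
    if secrets.randbits(1):  # hide which ciphertext belongs to which input bit
        table.reverse()
    return table, K, U


def evaluate(table, key):
    """Return the output key reachable with the given input key, else None (reject)."""
    for ct in table:
        out = dec(key, ct)
        if out is not None:
            return out
    return None


def decode(U, out):
    """Map an output key U_b back to the output bit b."""
    return U.index(out)
```

In the protocol, Alice would obtain the single key `K[x]` through oblivious transfer, call `evaluate`, and translate the resulting output key into the bit z with the pair `U` sent by Bob.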

A key element of the security analysis is the existence of a polynomial-time
simulator $sim_{Yao}(C, x, g(x,y))$ that outputs a tuple
$\mathcal{C}, K_{1,x_1}, \ldots, K_{n_x,x_{n_x}},
(U_{1,0}, U_{1,1}), \ldots, (U_{n_z,0}, U_{n_z,1})$; the distribution of the
simulator’s output is computationally indistinguishable from that induced on
these same variables by construct(C, y) and x. Intuitively, given x and g(x,y),
the simulator can simulate Alice’s view obtained by running Yao’s protocol with
an ideal (information-theoretically secure) ANDOS.

The existence of construct, evaluate, and $sim_{Yao}$ may be based on the
existence of pseudo-random functions; efficient implementations of pseudo-random
functions can be based on the DDH assumption.



4 One-Round Secure Computation 
for Polynomial-Size Circuits 

The basic idea of our one-round secure computation protocols is to combine the 
one-round oblivious transfer protocols with the encrypted circuit construction. 



4.1 Honest Behavior 

In the honest case, we use the one-round oblivious transfer protocol OT-1 and 
send Bob’s reply in OT-1 along with the encrypted circuit computing g. The 
resulting scheme consists of the following three algorithms $A_1$, $A_2$, and B
(using the notation above).

$A_1(x)$: Compute the first messages of Alice for $n_x$ parallel oblivious
transfer protocols: Let $(\delta^{(i)}, \beta_0^{(i)}, \beta_1^{(i)})$ and
$\alpha^{(i)}$ be computed as in Step 2 of protocol OT-1 with input $x_i$ of
Alice for $i = 1, \ldots, n_x$. Output
$s = (\alpha^{(1)}, \ldots, \alpha^{(n_x)})$ and
$m_1 = ((\delta^{(1)}, \beta_0^{(1)}, \beta_1^{(1)}), \ldots,
(\delta^{(n_x)}, \beta_0^{(n_x)}, \beta_1^{(n_x)}))$.

$B(y, m_1)$: Invoke construct(C, y) to obtain
$(\mathcal{C}, (K_{1,0}, K_{1,1}), \ldots, (K_{n_x,0}, K_{n_x,1}),
(U_{1,0}, U_{1,1}), \ldots, (U_{n_z,0}, U_{n_z,1}))$. Next, for each
$i = 1, \ldots, n_x$, execute Step 3 of protocol OT-1 using
$(\delta^{(i)}, \beta_0^{(i)}, \beta_1^{(i)})$ (taken from $m_1$) and with Bob’s
inputs $(a_0, a_1)$ set to $(K_{i,0}, K_{i,1})$. (Provided |G| is sufficiently
large, such an encoding of binary strings in G is possible.) Denote the output
of this step by $m_{2,i} = (e_0^{(i)}, f_0^{(i)}, e_1^{(i)}, f_1^{(i)})$. Output
$m_2 = (\mathcal{C}, m_{2,1}, \ldots, m_{2,n_x},
(U_{1,0}, U_{1,1}), \ldots, (U_{n_z,0}, U_{n_z,1}))$.

$A_2(s, m_2)$: For $i = 1, \ldots, n_x$ execute Step 4 of protocol OT-1 using
$x_i$ as c, $(e_{x_i}^{(i)}, f_{x_i}^{(i)})$ (taken from $m_2$) as $(e_c, f_c)$,
and $\alpha^{(i)}$ (taken from s) as $\alpha$, hence recovering
$K_{1,x_1}, \ldots, K_{n_x,x_{n_x}}$. Finally, invoke
evaluate($\mathcal{C}, K_{1,x_1}, \ldots, K_{n_x,x_{n_x}}$) to obtain
$U_{1,z_1}, \ldots, U_{n_z,z_{n_z}}$ and output z.
4.2 Analysis of the Honest-Party Case

Our description of Yao’s protocol assumed an ideal implementation of ANDOS. 
We now analyze the above protocol using the oblivious transfer protocol OT-1. 
When both parties are honest, the combined protocol’s correctness follows easily 
from the correctness of Yao’s protocol and the oblivious transfer protocol. To 
show privacy, we construct simulators for Alice’s and Bob’s views. Note that 
this separation of privacy and correctness is not valid for parties with arbitrary 
behavior. 

Let $View_{Alice}(x,y)$ and $View_{Bob}(x,y)$ denote Alice’s and Bob’s view of
the protocol (C is always a fixed common input, and is dropped for notational
convenience). We must simulate each player’s view given only the information they
are supposed to learn. That is, Bob is allowed to learn y, and Alice is allowed to
learn x and g(x,y).

To simulate $View_{Bob}(x,y)$, we define simulator $sim_{Bob}(y)$ as follows:

1. Choose Alice’s input $x = 0^{n_x}$.

2. Engage in the secure function evaluation protocol with Bob where the
simulated Alice plays as she would if given x. Return the view obtained by Bob
during the execution of this protocol.

Lemma 1. For all values of (C,y), $sim_{Bob}(y)$ and $View_{Bob}(y)$ are identically
distributed. 

The proof follows from the fact that in every execution of the OT-1 sub-protocol, 
Alice’s message is independent of her input. 

Next, we simulate $View_{Alice}(x,y)$. Define $sim_{Alice}(x, g(x,y))$ as follows:

1. The simulator invokes the simulator $sim_{Yao}(C, x, g(x,y))$ so as to obtain
$\mathcal{C}, K_{1,x_1}, \ldots, K_{n_x,x_{n_x}},
(U_{1,0}, U_{1,1}), \ldots, (U_{n_z,0}, U_{n_z,1})$.

2. For $i = 1, \ldots, n_x$, the simulator chooses $K_{i, x_i \oplus 1} = 0^k$.

3. The simulator engages in the protocol transfer with Alice exactly as would
Bob, given input pairs $(K_{1,0}, K_{1,1}), \ldots, (K_{n_x,0}, K_{n_x,1})$ and
encrypted circuit $\mathcal{C}$. The simulator returns Alice’s view of this
protocol.

Lemma 2. For all values of x and y, $View_{Alice}(x,y)$ and $sim_{Alice}(x, g(x,y))$ are
computationally indistinguishable. 

The proof works by a hybrid argument (omitted). Our first result follows. 

Theorem 1. Under the DDH assumption, $(A_1, A_2, B)$ is a one-round secure
computation scheme in the honest-but-curious model, with perfect security against
unbounded Bob.

4.3 Allowing Malicious Behavior 

For polynomially bounded, arbitrarily malicious parties, we obtain secure one- 
round computation in a model with certified public-keys and public randomness. 
First, because we can no longer trust Alice to choose $\delta$ at random, we
replace protocol OT-1 by protocol OT-2 (using public randomness) in the above
construction. Then Bob must prove that his messages in OT-2 are consistent with a
correct construction of $\mathcal{C}$, $\{(K_{i,0}, K_{i,1})\}$, and Alice must
prove that she knows the discrete logarithm for one element of each pair
$(\beta_0^{(i)}, \beta_1^{(i)})$ (e.g., using a result by Cramer et al.). In the
security proof for protocol OT-2 one extracts the discrete logarithms from Alice
and thereby obtains her input x ($x_i$ corresponds to the element of
$(\beta_0^{(i)}, \beta_1^{(i)})$ of which Alice knows the discrete logarithm).

A fallacious step would be to use a public random string to implement
non-interactive zero-knowledge proofs (NIZKP) that each player’s message is
well formed. The formal complication to this method is that “standard” NIZKP 
are not proofs of knowledge. Instead, we use the “public-key” scenario for non- 
interactive proofs of knowledge, put forth by Simon and Rackoff as follows. 
Each player has a public-key P that is certified by some trusted center once and 
forever. The player convinces the center, via a standard zero-knowledge proof 
of knowledge, that he knows the corresponding secret key S for P. Henceforth, 
the secret key is assumed available to the simulator/extractor. To make a non- 
interactive proof of knowledge of the solution to some problem in NP, the player 
simply encrypts, using P, whatever it is he wishes to show knowledge of, and 
then non-interactively proves (using standard NIZKP) that the encryption, if
decrypted, would yield a solution to the problem. The extractor, who knows S, 
can then recover the solution as well. Details are omitted from this extended 
abstract. 

5 Securing Autonomous Mobile Agents 

The mobile agent paradigm has several attractive features. One of them is the 
flexibility of delegating a task to an autonomous agent, who roams the net, visits 
different sites, collects information, computes intermediate results, and returns 
to the originator only when the computation is finished. No interaction with the 
originator is needed in-between. 

Sander and Tschudin recognized that mobile code can be protected
against a curious host using the approach of “computing with encrypted func- 
tions.” However, their solution addresses only the case of agents who return 
home after visiting one host. We consider autonomous agents here that leave 
the originator without a fixed list of hosts to visit in mind and consider the 
question: How does the agent migrate securely from one host to another? 



5.1 Model 

More formally, there is the agent’s originator O and $\ell$ hosts
$H_1, \ldots, H_\ell$ that run the agent. The state of the agent is represented
by some $x \in \mathcal{X}$. The initial state is chosen by O. All that is known
about the computation is represented by
$g_j : \mathcal{X} \times \mathcal{Y} \to \mathcal{X}$ associated with $H_j$,
which updates the agent’s state according to $H_j$’s secret input $y_j$. This
models arbitrary polynomial-time computations provided the functions $g_j$ are
representable by polynomial-size circuits.

A novel feature of our protocol is that neither the hosts nor the path taken 
by the agent need to be fixed or known beforehand. Only for the simplicity of 
description do we assume that the agent travels from O to host $H_1$, then from
host $H_j$ to host $H_{j+1}$ for $j = 1, \ldots, \ell - 1$ and then from
$H_\ell$ back to O; the generalization can be derived easily. The agent is
autonomous because either a
host may decide where to send the agent next or because the agent, besides the 
encrypted part, consists also of conventional code that computes where to go 
next based on non-private information. (Note that migration decisions cannot 
depend on the private state of the computation, as they would be observable by 
the hosts and thereby leak information about the internal state!) 

A scheme for secure computation by autonomous mobile agents consists of
efficient algorithms $A_1(\cdot)$, $A_2(\cdot,\cdot)$,
$B_1(\cdot,\cdot), \ldots, B_\ell(\cdot,\cdot)$. The agent’s computation
proceeds as follows: first, O runs $A_1(x)$ on input x and thereby obtains a
secret s and a message $m_0$, which O sends to $H_1$; likewise, for
$j = 1, \ldots, \ell$, $H_j$ runs $B_j(y_j, m_{j-1})$ on input $y_j$, message
$m_{j-1}$ and obtains $m_j$, which it sends to $H_{j+1}$ (with the exception
that $H_\ell$ sends $m_\ell$ to O). Finally, upon receiving $m_\ell$, O
obtains the desired result by invoking $A_2(s, m_\ell)$. We require:

Correctness: $\forall x \in \mathcal{X}$ and $\forall y_j \in \mathcal{Y}$, the
decoding algorithm $A_2(s, m_\ell)$ outputs
$z = g_\ell(\cdots g_2(g_1(x, y_1), y_2) \cdots, y_\ell)$;

Privacy: (1) the inputs and computations of the visited hosts remain hidden
from the other hosts: for all j, message $m_j$ does not give information about
x and $y_{j'}$ for $j' < j$; (2) the originator should learn only the output of
the computation, but nothing else about the inputs of the hosts:
$\forall x \in \mathcal{X}$ and $\forall y_j \in \mathcal{Y}$
($j = 1, \ldots, \ell$), given only x, s, and z (as above), $m_\ell$ can be
simulated efficiently.
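
The correctness requirement says that the decoded result must equal the plain, unprotected composition of the hosts' update functions. A minimal Python sketch of that reference computation follows; the name `agent_result` and the example functions are hypothetical and serve only to illustrate the nesting order.

```python
from functools import reduce


def agent_result(x, host_functions, host_inputs):
    """Compute g_l(... g_2(g_1(x, y_1), y_2) ..., y_l) without any protection.

    host_functions[j] plays the role of g_{j+1}, host_inputs[j] of y_{j+1}.
    """
    return reduce(
        lambda state, gy: gy[0](state, gy[1]),
        zip(host_functions, host_inputs),
        x,
    )


# Hypothetical example: two hosts acting on a 4-bit state.
g1 = lambda state, y: (state + y) % 16
g2 = lambda state, y: state ^ y
z = agent_result(3, [g1, g2], [5, 6])
```

A secure scheme must make $A_2(s, m_\ell)$ return this same value z while the intermediate states stay hidden from the hosts.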

Honest-but-curious behavior is assumed on behalf of all parties throughout this 
section (dishonest behavior can be prevented analogously to the two-party case) . 

5.2 Protocol 

Our protocol for secure computation by autonomous mobile agents is an
extension of the one-round secure computation protocol in Section 4.1 to multiple
hosts, which take over the part of Bob. O proceeds as Alice, sending the first
message and receiving the encrypted circuit computing
$g_\ell(\cdots g_2(g_1(x, y_1), y_2) \cdots, y_\ell)$.
Each host $H_j$ contributes the part of the encrypted circuit representing its
function $g_j$; thus the resulting encrypted circuit is a cascade of
sub-circuits. $H_1$ generates the key pairs representing O’s input and computes
the answers for the oblivious transfer protocol; these are attached to the
computation and reach O with the message from $H_\ell$. To extend the cascade of
sub-circuits, $H_j$ encrypts each input key of its sub-circuit with the
corresponding output key from the preceding sub-circuit. This is done using a
symmetric encryption algorithm $enc_K(\cdot)$, realized in the same way as the
encryptions for single gates in Yao’s construction; in particular, this scheme
has the property that given a key K, one can efficiently
check if a ciphertext represents an encryption under key K.

We describe algorithms $A_1$, $A_2$, and $B_1, \ldots, B_\ell$ using notation from above.

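$A_1(x)$: Compute the first message (of Alice) for $n_x$ parallel oblivious
transfer protocols. This results in $s = (\alpha^{(1)}, \ldots, \alpha^{(n_x)})$
and $\tilde{m}_0 = ((\delta^{(1)}, \beta_0^{(1)}, \beta_1^{(1)}), \ldots,
(\delta^{(n_x)}, \beta_0^{(n_x)}, \beta_1^{(n_x)}))$, computed as in OT-1.
Output s and $m_0 = (\tilde{m}_0, \emptyset)$.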

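$B_j(y_j, m_{j-1})$: Invoke construct($C_j$, $y_j$) to obtain
$(\mathcal{C}^{(j)}, (K^{(j)}_{1,0}, K^{(j)}_{1,1}), \ldots,
(K^{(j)}_{n_x,0}, K^{(j)}_{n_x,1}), (U^{(j)}_{1,0}, U^{(j)}_{1,1}), \ldots,
(U^{(j)}_{n_x,0}, U^{(j)}_{n_x,1}))$.

If j = 1, then execute Step 3 of protocol OT-1 using
$\delta^{(i)}, \beta_0^{(i)}, \beta_1^{(i)}$ (taken from $m_0$) and with Bob’s
input set to $(K^{(1)}_{i,0}, K^{(1)}_{i,1})$. Denote the output of the
OT-1 step by $m^{(i)} = (e_0^{(i)}, f_0^{(i)}, e_1^{(i)}, f_1^{(i)})$. Set
$\tilde{m}_1 = (m^{(1)}, \ldots, m^{(n_x)}, \mathcal{C}^{(1)})$ and output
$m_1 = (\tilde{m}_1, (U^{(1)}_{1,0}, U^{(1)}_{1,1}), \ldots,
(U^{(1)}_{n_x,0}, U^{(1)}_{n_x,1}))$.

If $1 < j \le \ell$, then the outputs of $C_{j-1}$ are recoded as inputs to
$C_j$. To this end, for $i = 1, \ldots, n_x$ do the following: choose a random
bit $c_i$ and, for $b \in \{0,1\}$, encrypt key $K^{(j)}_{i,b}$ under
$U^{(j-1)}_{i,b}$ (taken from $m_{j-1}$) as
$V^{(j)}_{i, b \oplus c_i} = enc_{U^{(j-1)}_{i,b}}(K^{(j)}_{i,b})$. Next, set
$\tilde{m}_j = (\tilde{m}_{j-1}, \mathcal{C}^{(j)}, (V^{(j)}_{1,0}, V^{(j)}_{1,1}),
\ldots, (V^{(j)}_{n_x,0}, V^{(j)}_{n_x,1}))$ and then output
$m_j = (\tilde{m}_j, (U^{(j)}_{1,0}, U^{(j)}_{1,1}), \ldots,
(U^{(j)}_{n_x,0}, U^{(j)}_{n_x,1}))$.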

$A_2(s, m_\ell)$: Run Step 4 of protocol OT-1 and obtain input keys
$K^{(1)}_{1,x_1}, \ldots, K^{(1)}_{n_x,x_{n_x}}$ of $\mathcal{C}^{(1)}$. Now, run
algorithm evaluate($\mathcal{C}^{(1)}, K^{(1)}_{1,x_1}, \ldots,
K^{(1)}_{n_x,x_{n_x}}$) to obtain the output keys of $\mathcal{C}^{(1)}$. Each
one of these decrypts one ciphertext $V^{(2)}_{i,b}$ to an input key of
$\mathcal{C}^{(2)}$, which can then be evaluated and in turn allows decryption
of the input keys of $\mathcal{C}^{(3)}$. Proceeding similarly for all circuits
$\mathcal{C}^{(3)}, \ldots, \mathcal{C}^{(\ell)}$ will eventually reveal
$U^{(\ell)}_{1,z_1}, \ldots, U^{(\ell)}_{n_x,z_{n_x}}$ from which the result z
can be retrieved.
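
The recoding of output keys into input keys of the next sub-circuit can be illustrated for a single wire. The Python sketch below is an assumption-laden toy: the hash-based pad, the zero tag that makes "is this an encryption under key K" checkable, and the names `recode` and `follow` are not from the text.

```python
import hashlib
import secrets

KEY_BYTES = 16
TAG = b"\x00" * 8  # lets the evaluator recognise a ciphertext made under its key


def enc(key, msg):
    pad = hashlib.sha256(key).digest()
    return bytes(p ^ m for p, m in zip(pad, msg + TAG))


def dec(key, ct):
    pad = hashlib.sha256(key).digest()
    plain = bytes(p ^ c for p, c in zip(pad, ct))
    return plain[:-len(TAG)] if plain.endswith(TAG) else None


def recode(U_prev, K_next):
    """Host H_j: encrypt input key K_next[b] under the preceding output key
    U_prev[b]; a random bit decides the order in which the pair is listed."""
    c = secrets.randbits(1)
    V = [None, None]
    for b in (0, 1):
        V[b ^ c] = enc(U_prev[b], K_next[b])
    return V


def follow(V, output_key):
    """Originator O: use one output key of the preceding sub-circuit to open
    exactly one of the two ciphertexts and obtain the next input key."""
    for ct in V:
        key = dec(output_key, ct)
        if key is not None:
            return key
    return None
```

Holding only `U_prev[b]`, the originator recovers `K_next[b]` and learns nothing useful from the other ciphertext, which is what lets the cascade be evaluated one sub-circuit after another.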



As for the security of the protocol, note that each host sees an encrypted 
circuit representing the computation so far, like Alice in the original protocol 
but lacking the secrets to decrypt the oblivious transfers. A simulator for each 
host’s view is straightforward. When the encrypted circuit reaches O, it consists 
only of information that has been constructed using the same method as in the 
original protocol; thus, the security follows from the original argument. 
Abstract. We present the first optimistic n-party contract signing
protocol for asynchronous networks that tolerates up to n - 1 dishonest
signatories and terminates in the minimum number of rounds (O(n)).
We also show how to make this protocol abuse-free by using standard
cryptographic primitives (digital signatures, public-key encryption) only.
Previous solutions required $O(n^2)$ rounds of communication, and
non-standard cryptographic primitives for abuse freeness.



1 Introduction 



A contract signing protocol is a protocol that allows n signatories to sign a 
contract text such that, even if up to n - 1 of them are dishonest, either all
honest signatories obtain a signed contract, or nobody obtains it.

Dishonest signatories can arbitrarily deviate from their protocol (Byzantine 
model). We assume an asynchronous network: all messages are eventually
delivered, but there are no upper bounds on network delays.

Multi-party contract signing has obvious applications in secure electronic 
commerce, and is the basis for the solution of many related fairness problems, 
like multi-party certified mail.

By using a third party, T, that is always honest, the problem can be trivially 
solved: T collects a digital signature from each of the n signatories, and
either redistributes the n signatures, or aborts the contract in case not all n
signatures arrive. Security depends fully on T; therefore most research has
focused on getting rid of T as a trust and performance bottleneck.
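
The trivial solution with an always-honest T can be sketched in a few lines of Python. This is an illustration under loud assumptions: HMAC tags with keys shared with T stand in for digital signatures (a real protocol needs publicly verifiable signatures), and the names `sign` and `trusted_third_party` are hypothetical.

```python
import hashlib
import hmac


def sign(key, text):
    """Stand-in for a digital signature: an HMAC tag (assumption, not a real
    signature scheme, since only holders of the key can verify it)."""
    return hmac.new(key, text, hashlib.sha256).digest()


def trusted_third_party(contract, keys, received):
    """T's decision once it stops collecting.

    keys:     party name -> key T uses to check that party's tag
    received: party name -> tag that actually arrived (parties may be missing)
    Returns ("signed", all tags) or ("failed", None).
    """
    for party, key in keys.items():
        tag = received.get(party)
        if tag is None or not hmac.compare_digest(tag, sign(key, contract)):
            return ("failed", None)
    return ("signed", dict(received))
```

Either every signatory's tag is redistributed or none is, which is exactly why the security of this approach rests entirely on T.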

Unfortunately, one cannot get rid of T completely: For n = 2 no deterministic 
protocol without third party exists, and each probabilistic protocol has an
error probability at least linear in the number of rounds. These results apply
also to the n-party case, provided a majority of all signatories might be dishonest. 

Therefore our goal is to minimize the involvement of T as much as possible: 
Optimistic protocols depend on a third party T, but in such a way that T is not 
actively involved in case all signatories are honest; only for recovery purposes T 
might become active.

Optimistic contract signing protocols were first described for synchronous
networks. 2-party protocols for asynchronous networks have
been described in mniE]. The n-party case was first investigated in 0.
Independently, but subsequently, protocols for the 3-party and n-party case were
proposed in m and resp. In Sect. 3 we give a precise definition of the
asynchronous n-party contract signing problem.

Footnote: Actually we do not need to require eventual delivery for all messages;
see Sect. 2.

In Sect. 0 we describe and prove our new asynchronous optimistic n-party
contract signing protocol. The protocol is optimistic in case all signatories agree
to sign the contract. It requires t + 4 rounds of communication in the worst
case, t + 2 rounds in the optimistic case, and $O(tn^2)$ messages, where t < n is
a protocol parameter describing the maximum number of dishonest signatories.
A variant requires 2t + 6 rounds but only O(tn) messages. For t = n - 1 any
such protocol requires O(n) rounds; thus, our protocol is asymptotically
round-optimal. No previous protocol achieved this bound; the protocol proposed
in requires O(n^) rounds and 0{n^) messages. 

In Sect. 0 we add abuse freeness to the protocol of Sect. 0. A protocol can
be abused if at some point in time the dishonest signatories fully control the
result (i.e., depending on how they behave the result will be signed or failed,
irrespective of the honest signatories’ behavior) and can prove to an outside
party that they can force the result to be signed [12].

Two-party abuse-free protocols have been introduced in m and, although
without mentioning this feature, in 0. The 3-party and n-party cases were
considered in [12] and [13], respectively.

Our construction for abuse freeness is significantly simpler than the
construction in [12,13]: it uses standard cryptographic primitives only (public-key
encryption, digital signatures), i.e., unlike [12,13] we do not need specific
primitives. The construction preserves the round optimality.



2 Model and Notation 

Parties. Let $P_1, \ldots, P_n$ denote the signatories, T a third party, and
$V_1, \ldots, V_{n'}$ the potential verifiers of signed contracts. Each party
represents a machine that executes a certain protocol and that serves a specific
user (e.g., human being or
another protocol). Parties can receive inputs from their users (e.g., a signatory 
the command to sign a certain contract) and can generate certain outputs for 
their users (e.g., the result of the protocol). 

The signatories $P_i$ and third party T are able to digitally sign messages, and
all parties are able to verify their signatures. The signature on message m
associated with $P_i$ (T) is denoted by $sign_{P_i}(m)$ ($sign_T(m)$). In our
complexity analyses we assume that sign(m) has constant length, independent of
the length of m (e.g., because m is hashed before signing). We abstract from the
error probabilities introduced by cryptographic signature schemes and assume 
that signatures are unforgeable (the translation into a cryptographic model is 
straightforward) . 

Adversary model. We assume that up to t (for a given t < n) of the n
signatories, and for some requirements also T and all verifiers might be dishonest.
Dishonest parties can behave arbitrarily and are coordinated by a single party,
called the adversary. Security properties must always hold for all adversaries. 

Network. All messages sent to or from T or any Vi are reliably delivered, 
eventually. We do not require any particular level of synchronization, nor do we 
assume that messages are delivered in order. The decision on which of all sent 
messages to deliver next is taken by the adversary. The adversary can read all 
messages from, and can insert additional messages into all channels. 

All-honest case. This is a special case where we assume that all n signatories
are honest and all messages sent are reliably delivered, eventually. In Sect. 5
we assume, additionally, that in this case, the adversary does not insert any
messages into the channels between signatories.*

We make use of a few conventions, in order to simplify presentation: 

Timeouts. If we say that “party X waits for a certain set of messages, M, but
can stop waiting any time” we mean more precisely the following: X accepts a 
special input wakeup from its user, for each protocol execution. “Waiting for M” 
means that X continues to receive messages but does not proceed in the protocol 
run until either all messages in M have been received, or wakeup is input. (In 
case wakeup is input before X starts waiting for M it proceeds immediately.) 
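
The timeout convention can be illustrated with a minimal sketch. The function name `wait_for`, the event list, and the string "wakeup" are illustrative assumptions and not part of the protocol description; messages outside M are received but do not advance the protocol.

```python
def wait_for(expected, events):
    """Consume events until all of `expected` arrived or "wakeup" is input.

    `events` is an iterable of received messages; the string "wakeup"
    models the special user input (hypothetical encoding).
    Returns (status, received) with status in
    {"complete", "wakeup", "pending"}.
    """
    expected = set(expected)
    received = set()
    if not expected:
        return "complete", received
    for ev in events:
        if ev == "wakeup":
            # The user stopped the wait; a wakeup that arrives first
            # makes the party proceed immediately.
            return "wakeup", received
        if ev in expected:
            received.add(ev)
        # Messages outside M are still received but change nothing.
        if received == expected:
            return "complete", received
    return "pending", received
```

A party that gets status "wakeup" would then follow the exception branch of its protocol instead of entering the next round.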

Initial inputs and protocol parameters. We assume that the parties and their 
public keys are known and fixed in advance. 



3 Definitions 

Definition 1 (Asynchronous Multi-party Contract Signing). An Asynchronous
Multi-Party Contract Signing Scheme (asynchronous MPCS) consists
of two protocols:

– sign[P1, ..., Pn], for signing a contract with signatories P1, ..., Pn. sign[]
might involve an additional party, T, in which case we call it an MPCS
with third party.

– verify[Pi, Vj], for showing a signed contract to any verifier, Vj, $j \in \{1, \ldots, n'\}$.
verify[] never involves T, i.e., it is always a 2-party protocol only.

A signatory Pi starts sign[] on input (sign, $\mathit{tid}_i$, $\mathit{contr}_i$).† $\mathit{tid}_i$ is a transaction
identifier unique for all executions of sign[], $\mathit{contr}_i$ is the contract text to be
signed.

Upon termination sign[] produces an output ($\mathit{tid}_i$, $\mathit{contr}_i$, $d_i$) for Pi, with
$d_i \in$ {signed, failed}. We will simply say “Pi decides $d_i$.” Pi may receive an input
(wakeup, $\mathit{tid}_i$) any time.

* This assumption is not needed for a weaker version of abuse freeness; see [?].

† If the user of Pi does not wish to sign then Pi does not participate in sign[]. One
could also allow an additional input $\mathit{decision}_i \in$ {sign, reject}, indicating whether
the user wants to sign or not, and assume that in the all-honest case each signatory
receives an input and participates in sign[]. This would enable faster termination in
case one signatory gets input reject.

A signatory Pi starts verify[] with verifier Vj on input (show, Vj, $\mathit{tid}_i$, $\mathit{contr}_i$).
A verifier Vj starts verify[] on input (verify, $\mathit{tid}_V$, $\mathit{contr}_V$). Upon termination
verify[Pi, Vj] produces an output ($\mathit{tid}_V$, $\mathit{contr}_V$, $d_V$) with $d_V \in$
{signed, verify_failed} for Vj. We will simply say “Vj decides $d_V$.” No output
is produced for Pi. Vj may receive an input (wakeup, $\mathit{tid}_V$) any time.

The following requirements must be satisfied: 

(R1) Correct execution. In the all-honest case, if all signatories start with the
same input (sign, tid, contr), and no signatory receives (wakeup, tid),
then all signatories terminate and decide signed.

(R2) Unforgeability. If an honest Pi never received input (sign, tid, contr) then
no honest Vj that receives input (verify, $P_{i'}$, tid, contr), for any $P_{i'}$, will
decide signed. (Note that this does not assume an honest T.)

(R3) Verifiability of valid contracts. If an honest Pi decides signed on input
(sign, tid, contr), and later Pi receives input (show, Vj, tid, contr)
and honest Vj receives input (verify, tid, contr) and does not receive
(wakeup, tid) afterwards then Vj will decide signed.

(R4) No surprises with invalid contracts. If T is honest, and an honest Pi
received input (sign, tid, contr) but decided failed then no honest verifier
Vj receiving (verify, $P_{i'}$, tid, contr), for any $P_{i'}$, will decide signed.

(R5) Termination of sign[]. If T is honest then each honest Pi that receives
sign and wakeup (for the same tid and contr) will terminate eventually.

(R6) Termination of verify[]. Each honest Vj that receives verify and then
wakeup, for the same tid, and each honest Pi that receives show will
terminate eventually.

Definition 2 (Optimistic Protocol). An MPCS with third party T is called
optimistic on agreement if in the all-honest case, if all signatories receive input 
(sign, tid, contr) and none receives (wakeup, tid) the protocol terminates without 
T ever sending or receiving any messages. 

It is called optimistic on disagreement if in the all-honest case, if some sig- 
natories do not receive input (sign, tid, contr) the protocol terminates without 
T ever sending or receiving any messages. 

It is called optimistic if it is optimistic on agreement and on disagreement. 

Definition 3 (Abuse freeness [12]). An MPCS is called abuse-free if it is
impossible for the adversary at any point in the protocol to be able to prove to
an outside party that he has the power to control the contract, i.e., to terminate
the protocol with result signed or failed.

Remark 1. Def. 1 captures plain contract signing only. We do not require
additional features, like fair exchange of pre-defined signatures [?] or invisibility of
T [?]. If required, they can be implemented on top of contract signing.

4 Asynchronous Optimistic Multi-party Contract Signing 

The following Scheme 1 solves the multi-party contract signing problem, and is
optimistic on agreement.

Ignoring all details, the protocol works as follows: It consists of t + 2 locally
defined rounds. In Round 1 each signatory that starts the protocol signs a
“promise” to sign the contract and broadcasts this promise. In each subsequent
round each signatory collects all signatures from the previous round, countersigns
this set of n signatures, and broadcasts it.‡ The result of the (t + 2)-nd
round becomes the real contract.

A signatory who becomes tired of waiting for some signatures in some round 
sends the information received so far to the honest T, stops sending further 
messages and simply waits for T’s answer. T analyses the situation and decides 
either failed or signed: 

If the first request received by T comes from a signatory in the first round 
then T must decide failed. If T receives a request from the last round then T must 
decide signed, as other signatories might already have the signed contract. Thus 
if T receives requests from all rounds then somewhere in the middle between the 
first and the last round T might have to change the decision from failed to signed. 
But T must do this only if it is clear that all requests that were answered with 
failed came from dishonest signatories — otherwise two honest signatories might 
decide differently. 

Scheme 1 (Asynchronous Optimistic Multi-party Contract Signing)
Protocol “sign” for honest Pi:

The protocol is given by the following rules; Pi applies them, sequentially,
until it stops. Let $c_i := (\mathit{tid}_i, \mathit{contr}_i)$.

The protocol will proceed in locally defined rounds. Let r := 1 be a local round
counter for Pi, and let raised_exception := false be a Boolean variable indicating
whether Pi contacted T or not. Both are initialized before executing any rule.
Let $M_{0,i} := \mathit{nil}$, for all i.

– Rule S1: If raised_exception = false and r = 1 then:

• Pi sends $m_{1,i} := \mathrm{sign}_i(c_i, 1, \mathsf{prev\_rnd\_ok})$ to all signatories.

• From all received messages of type $m_{1,j}$ it tries to compile full and
consistent vectors $M_{1,i} := (m_{1,1}, \ldots, m_{1,n})$ and $X_{1,i} := M_{1,i}$ with
$m_{1,j} = \mathrm{sign}_j(c_i, 1, \mathsf{prev\_rnd\_ok})$. (NB: here we also check that all
signatories agree on $c_i$.) If this succeeds Pi sets r := 2.

At any time Pi can stop waiting for any missing $m_{1,j}$ (i.e., the user of Pi
might enter wakeup any time), in which case it sets raised_exception := true
and sends $\mathit{resolve}_{1,i} := (1, i, \mathrm{sign}_i(m_{1,i}, \mathsf{resolve}))$ to T.

– Rule S2: If raised_exception = false and 2 ≤ r ≤ t + 2 then:

• Pi sends $m_{r,i} := (\mathrm{sign}_i(M_{r-1,i}, r, \mathsf{vec\_ok}), \mathrm{sign}_i(c_i, r, \mathsf{prev\_rnd\_ok}))$ to all
signatories.

• From all received messages of type $m_{r,j}$ it tries to compile full and
consistent vectors

$M_{r,i} = (\mathrm{sign}_1(c_i, r, \mathsf{prev\_rnd\_ok}), \ldots, \mathrm{sign}_n(c_i, r, \mathsf{prev\_rnd\_ok}))$

$X_{r,i} := (\mathrm{sign}_1(M_{r-1,i}, r, \mathsf{vec\_ok}), \ldots, \mathrm{sign}_n(M_{r-1,i}, r, \mathsf{vec\_ok}))$

If this succeeds and

r < t + 2 then it sets r := r + 1;

r = t + 2 then it decides signed, sets $C_i := M_{r,i}$, and stops.

At any time Pi can stop waiting for any missing $m_{r,j}$. In this
case Pi sets raised_exception := true and sends
$\mathit{resolve}_{r,i} := (r, i, \mathrm{sign}_i(X_{r-1,i}, \mathsf{resolve}), X_{r-1,i}, M_{r-2,i})$ to T.

‡ The real protocol does this more efficiently, in order to keep the messages short.

– Rule S3: If raised_exception = true then:

Pi waits for a message from T (without sending or receiving any further
messages). This can be any of $\mathit{signed}_{r,i} = (\mathit{resolve}_{r',j}, \mathrm{sign}_T(c, r', j, \mathsf{signed}))$
or $\mathit{aborted}_{r,i} = \mathrm{sign}_T(c_i, r, i, \mathsf{aborted})$. On receiving one Pi decides as
indicated by T and stops; if it decides signed it sets $C_i := \mathit{signed}_{r,i}$.
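
As a rough illustration of Rules S1 and S2 in the all-honest case, the following sketch runs all n signatories in lockstep for t + 2 rounds. It is a simplification under stated assumptions: `sign` is a hypothetical stand-in that returns a tagged tuple instead of a real signature, messages are never lost, and T is never contacted.

```python
def sign(signer, *payload):
    # Stand-in for an unforgeable signature (assumption): a tagged tuple.
    return ("sig", signer, payload)


def all_honest_run(n, t, c):
    """Lockstep all-honest run of protocol "sign".

    Returns (C, rounds): the contract vector M_{t+2} shared by all
    signatories and the number of rounds used.
    """
    M_prev = None  # M_{0,i} := nil
    rounds = 0
    for r in range(1, t + 3):  # rounds 1 .. t+2
        # Every P_j signs (c, r, prev_rnd_ok) in round r.
        ok = tuple(sign(j, c, r, "prev_rnd_ok") for j in range(1, n + 1))
        if r > 1:
            # Rule S2: every P_j also countersigns the vector M_{r-1}.
            X = tuple(sign(j, M_prev, r, "vec_ok") for j in range(1, n + 1))
            assert len(X) == n  # the full vector X_r could be compiled
        M_prev = ok  # M_r, identical for every honest signatory
        rounds = r
    return M_prev, rounds
```

With n = 3 and t = 2 the run ends after 4 rounds, and the returned vector has exactly the shape that a verifier checks under Rule V1.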

Protocol “sign” for third party T:

We assume T receives a message $\mathit{resolve}_{r,i}$, as defined in the protocol for Pi.
Let c be the (unique) pair (tid, contr) contained in $\mathit{resolve}_{r,i}$.

If this is the first time T is asked about this contract (i.e., tid) then T
initializes a Boolean variable signed := false and two sets $\mathit{con} := \emptyset$ and
$\mathit{abort\_set} := \emptyset$. signed indicates T’s current decision, signed or failed, con is the
set of all indices of signatories that contacted T, and abort_set is the set of all
failed-messages sent by T.

Processing $\mathit{resolve}_{r,i}$ cannot be interrupted, i.e., it cannot happen that T
processes two different $\mathit{resolve}_{r,i}$ concurrently.

– Rule T0: (T accepts only one request from each Pi. All other requests are
ignored.) If $i \in \mathit{con}$ then the message $\mathit{resolve}_{r,i}$ is ignored.

– Rule T1: (If T receives a request from Round r = 1 and has not yet decided
signed then T sends an abort to the requester.) If $i \notin \mathit{con}$ and signed = false
and r = 1 then T sets

$\mathit{aborted}_{r,i} := \mathrm{sign}_T(c, r, i, \mathsf{aborted})$
$\mathit{abort\_set} := \mathit{abort\_set} \cup \{\mathit{aborted}_{r,i}\}$
$\mathit{con} := \mathit{con} \cup \{i\}$
and sends $\mathit{aborted}_{r,i}$ to Pi.

– Rule T2: (If T receives a request from a Round r > 1 and has not yet decided
signed then T checks whether all previous requests came from dishonest
signatories, using Lemma 1. If this is the case then it changes the decision
to signed, otherwise T sticks to aborted.)

If $i \notin \mathit{con}$, signed = false, r > 1, and
if for all $\mathit{aborted}_{s,k} \in \mathit{abort\_set}$ we have s < r − 1
• then: If $\mathit{con} = \emptyset$ then T sets
$\mathit{first\_signed} := (\mathit{resolve}_{r,i}, \mathrm{sign}_T(c, r, i, \mathsf{signed}))$. T sets
$\mathit{signed}_{r,i} := \mathit{first\_signed}$
signed := true
$\mathit{con} := \mathit{con} \cup \{i\}$
and sends $\mathit{signed}_{r,i}$ to Pi.
• else: T sets

$\mathit{aborted}_{r,i} := \mathrm{sign}_T(c, r, i, \mathsf{aborted})$
$\mathit{abort\_set} := \mathit{abort\_set} \cup \{\mathit{aborted}_{r,i}\}$
$\mathit{con} := \mathit{con} \cup \{i\}$
and sends $\mathit{aborted}_{r,i}$ to Pi.

– Rule T3: (A decision signed is always preserved.) If $i \notin \mathit{con}$ and signed =
true then T sets

$\mathit{signed}_{r,i} := \mathit{first\_signed}$
$\mathit{con} := \mathit{con} \cup \{i\}$
and sends $\mathit{signed}_{r,i}$ to Pi.
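
The decision logic of Rules T0 to T3 can be sketched as a small state machine. This is only an illustration under stated assumptions: signatures and message contents are omitted, a request is reduced to its round r and sender index i, and the class and attribute names are hypothetical. The sketch records first_signed whenever T first switches to signed, which is one possible reading of the guard in Rule T2.

```python
class ThirdParty:
    """Decision logic of T for a single tid (signatures omitted)."""

    def __init__(self):
        self.signed = False       # T's current decision
        self.con = set()          # indices of signatories that contacted T
        self.abort_rounds = []    # rounds s of all aborted_{s,k} sent so far
        self.first_signed = None  # (r, i) of the request that made T sign

    def resolve(self, r, i):
        """Process resolve_{r,i}.

        Returns "aborted", "signed", or None if the request is ignored.
        """
        if i in self.con:              # Rule T0: one request per signatory
            return None
        self.con.add(i)
        if self.signed:                # Rule T3: signed is preserved
            return "signed"
        if r == 1:                     # Rule T1: first-round request
            self.abort_rounds.append(r)
            return "aborted"
        # Rule T2: switch to signed only if every earlier abort answered
        # a request from a round s < r - 1 (by Lemma 1 those requesters
        # are dishonest).
        if all(s < r - 1 for s in self.abort_rounds):
            self.signed = True
            self.first_signed = (r, i)  # assumption, see the lead-in
            return "signed"
        self.abort_rounds.append(r)
        return "aborted"
```

For example, a fresh T answers a first request from Round 1 with aborted, while a fresh T whose first request comes from Round 2 answers signed.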

Protocol “verify”:

If Pi wants to show a signed contract on c to verifier V it sends $C_i$ to V. V
outputs signed if it receives a message $C_i$ from Pi such that

– Rule V1: $C_i = (\mathrm{sign}_1(c, t+2, \mathsf{prev\_rnd\_ok}), \ldots, \mathrm{sign}_n(c, t+2, \mathsf{prev\_rnd\_ok}))$,
or

– Rule V2: $C_i = ((2, j, \mathrm{sign}_j(X_{1,j}, \mathsf{resolve}), X_{1,j}, M_{0,j}), \mathrm{sign}_T(c, 2, j, \mathsf{signed}))$,
for some j, $X_{1,j} = (\mathrm{sign}_1(c, 1, \mathsf{prev\_rnd\_ok}), \ldots, \mathrm{sign}_n(c, 1, \mathsf{prev\_rnd\_ok}))$ and
$M_{0,j} = \mathit{nil}$, or

– Rule V3: $C_i = ((r, j, \mathrm{sign}_j(X_{r-1,j}, \mathsf{resolve}), X_{r-1,j}, M_{r-2,j}), \mathrm{sign}_T(c, r, j, \mathsf{signed}))$,
for some r > 2, some j,
$X_{r-1,j} = (\mathrm{sign}_1(M_{r-2,j}, r-1, \mathsf{vec\_ok}), \ldots, \mathrm{sign}_n(M_{r-2,j}, r-1, \mathsf{vec\_ok}))$ and
$M_{r-2,j} = (\mathrm{sign}_1(c, r-2, \mathsf{prev\_rnd\_ok}), \ldots, \mathrm{sign}_n(c, r-2, \mathsf{prev\_rnd\_ok}))$,

and stops. On input wakeup V outputs verify_failed and stops.

Lemma 1. Consider Protocol “sign” of Scheme 1. If T receives $\mathit{resolve}_{r,i}$ and
finds $\mathit{aborted}_{s,k} \in \mathit{abort\_set}$ for an s ≤ r − 2 then Pk is dishonest.

Proof. Assume T finds $\mathit{aborted}_{s,k} \in \mathit{abort\_set}$ with s ≤ r − 2. Since s ≥ 1 we
have r ≥ 3, and therefore $\mathit{resolve}_{r,i}$ includes $\mathrm{sign}_k(M_{r-2,k}, r-1, \mathsf{vec\_ok})$, taken from
$m_{r-1,k}$. Thus Pk participated in Round r − 1, and since r − 1 > s this means that
Pk was still active after having sent $\mathit{resolve}_{s,k}$ to T. Thus Pk is dishonest. □

Theorem 1 (Security of Scheme 1). Scheme 1 is an asynchronous MPCS
with third party T for any t < n. It is optimistic on agreement and terminates
in t + 2 rounds if T is not involved, and in t + 4 rounds in the worst case.

Proof. (Sketch) 

Correct execution, termination of verify and optimistic on agreement are all 
obviously satisfied. 

Unforgeability of contract. All variants of a valid contract contain some pieces 
signed by all signatories, and these signatures exist only if all signatories started 
the protocol. 

Verifiability. The definitions of Ci in the signing protocol for Pi satisfy the 
conditions checked by V. 

No surprises with invalid contracts. Assume an honest Pi started the protocol,
decided failed, and some honest verifier V decides signed.

– Assume V decided because of Rule V1. This implies that Pi sent
$\mathrm{sign}_i(c, t+2, \mathsf{prev\_rnd\_ok})$ as part of $m_{t+2,i}$. Thus Pi asked T in Round t + 2 and
received $\mathit{aborted}_{t+2,i}$. According to T’s rules this means that there is some
signatory $P_{k_{t+1}}$ that received $\mathit{aborted}_{t+1,k_{t+1}}$. Inductively we can show (using
T2) that for all rounds $s \in \{1, \ldots, t\}$ there is one signatory $P_{k_s}$ that received
$\mathit{aborted}_{s,k_s}$. Because of Lemma 1 we know that all signatories $P_{k_s}$ with
$s \in \{1, \ldots, t\}$ are dishonest, and as there are at most t dishonest signatories
we know that $P_{k_{t+1}}$ must be honest. Since $P_{k_{t+1}}$ asked T in Round t + 1
it did not send its message for Round t + 2. Thus no signatory receives all
information that is necessary to construct a signed contract. This contradicts
our assumption.

– Assume V decides because of Rule V2 or Rule V3, seeing a message $\mathit{resolve}_{r,j}$.
Thus Pi asked T in some Round s with s < r and received $\mathit{aborted}_{s,i}$. As
Pi is honest we know that s ≥ r − 1 (Lemma 1). But this contradicts Rule
T2 of T, i.e., if Pi received $\mathit{aborted}_{s,i}$ then $\mathrm{sign}_T(c, r, i, \mathsf{signed})$ cannot exist.
Again we have a contradiction.

Termination of sign[]. For each signatory the protocol proceeds in t + 2 rounds,
and each round terminates, either because the full vector of n messages arrived
that allows it to enter the next round, or because T is asked and will eventually
send an answer (which results in a total number of t + 4 rounds if T is asked in
the last round). □

Remark 2. The protocol is optimistic on agreement only: If Pi starts the protocol
but Pk does not then Pi will send $\mathit{resolve}_{1,i}$ to T. In [?] we show that any such
protocol can be transformed into a protocol that is optimistic in all cases.

Number of Messages and Rounds. Let $C_s$ and $C_b$ be the costs of a single and a
broadcast message, respectively.

In the optimistic case, Protocol “sign” of Scheme 1 runs in t + 2 rounds
where each signatory broadcasts one message to all other signatories, resulting
in costs of $(t+2)nC_b$. In the worst case each signatory might have one message
exchange with T, resulting in t + 4 rounds and costs of $(t+2)nC_b + 2nC_s$. If one
assumes that each broadcast requires n − 1 single messages we end up with costs
of $((t+2)n(n-1) + 2n)C_s = O(tn^2)C_s$. All broadcast messages of Scheme 1
have length $O(\log n)$, all messages sent to or by T have length $O(n)$; thus we
need $O(tn^2 \log n)$ bits only.

Alternatively we can replace each round of n simultaneous broadcasts by 2
rounds and 2(n − 1) single messages. This will result in 2t + 6 rounds and costs
of $((t+2) \cdot 2(n-1) + 2n)C_s = O(tn)C_s$, and still $O(tn^2 \log n)$ bits: Each broadcast
round, say, Round r, is implemented by two mini-rounds: in Mini-Round $r_1$ each
signatory Pi with i > 1 sends its message to P1 only. P1 collects the full vector
of n messages, as it would do in the original protocol. If this succeeds it sends
this vector to all other signatories in Mini-Round $r_2$, which concludes Round r.
As before, everybody, in particular P1, can stop waiting any time and can send
a message to T. This modification does not affect our proof of security, since
only in the all-honest case messages sent between honest signatories have to be
delivered eventually.
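
The two cost formulas can be compared numerically with a short sketch. The function names are hypothetical; both functions count single messages (in units of $C_s$) and worst-case rounds exactly as in the formulas above, with each broadcast counted as n − 1 single messages.

```python
def broadcast_variant_cost(n, t):
    """Worst-case single-message count and rounds with direct broadcasts."""
    messages = (t + 2) * n * (n - 1) + 2 * n
    rounds = t + 4
    return messages, rounds


def relay_variant_cost(n, t):
    """Worst-case single-message count and rounds when P1 relays each round."""
    messages = (t + 2) * 2 * (n - 1) + 2 * n
    rounds = 2 * t + 6
    return messages, rounds
```

For n = 4 and t = 3 the direct variant needs 68 single messages in 7 rounds, while the relayed variant needs 38 single messages in 12 rounds, which shows the trade of rounds for messages.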

According to [?] any asynchronous optimistic multi-party contract signing
protocol tolerating t = n − 1 requires a number of rounds at least linear in n.

Corollary 1 (Round optimality).

The number of rounds in Protocol “sign” of Scheme 1 is asymptotically optimal
for t = n − 1.

5 Abuse-Free Asynchronous Multi-party Contract Signing

Scheme 1 is not abuse-free: Consider n = 2 and t = 1, and assume that P2 is
dishonest. Messages $m_{1,1}$, $m_{1,2}$ and $m_{2,1}$ are delivered. Now the adversary fully
controls the result: If he ignores $m_{1,1}$ and $m_{2,1}$ and delivers $\mathit{resolve}_{1,2}$ to honest
T then T will decide failed. If he delivers $\mathit{resolve}_{2,2}$ to T then T will decide signed.
The message $\mathit{resolve}_{2,2}$ convinces any outside party that T would actually decide
signed.

One cannot avoid that the adversary has an advantage in determining 
whether the protocol ends with signed or with failed, provided all signatories 
start the protocol. But one can avoid that the adversary can prove his ability to 
force the result signed to an outside party. 

In [12,13] this is achieved by introducing and constructing a new type of
digital signature scheme, called private contract signatures.

Our solution is conceptually much simpler, as it uses standard cryptographic
primitives only: The basic idea is to use Scheme 1 as it is, but let each signatory
Pi generate a new, fresh pair of secret and public signature keys, $(\mathit{sk}_i, \mathit{pk}_i)$, for
each execution of Protocol “sign” of Scheme 1. We call the result of such an
execution a pre-contract. As long as the adversary cannot prove that a certain
fresh key belongs to a certain signatory he cannot prove to an outside party the
status of the protocol, and hence the protocol is abuse-free.

We call the fresh keys $(\mathit{sk}_i, \mathit{pk}_i)$ short-term keys, in contrast to the long-term
keys used by Pi to generate signatures under its name. Note that by convention
(Sect. 2) all the public long-term keys are fixed in advance. Let $\mathrm{SIGN}_i(x)$ denote
Pi’s signature with its long-term key, and let $\mathrm{sign}_i(x)$ denote Pi’s signature with
its short-term key $\mathit{sk}_i$.

Our construction makes use of a chosen-ciphertext secure public-key encryption
scheme [?]. Only T needs to decrypt messages; let $E_T(r; x)$ denote the
encryption of x with T’s public key using random string r. We assume that each
party knows T’s public encryption key.

Scheme 2 (Asynchronous Optimistic Abuse-free MPCS)

Protocol “sign” for honest Pi:

– Phase 1, Distribution of encrypted certificates: On input
(sign, tid, contr) signatory Pi generates $(\mathit{sk}_i, \mathit{pk}_i)$ and computes
$\mathit{cert}_i := \mathrm{SIGN}_i(i, \mathit{pk}_i, \mathit{tid}, \mathit{contr})$ and $\mathit{ecert}_i := E_T(r_i; \mathit{cert}_i)$ for a randomly
chosen $r_i$. Pi distributes $(\mathit{pk}_i, \mathit{ecert}_i, \mathit{tid}, \mathit{contr})$ to all signatories.

Each Pi collects the list of all n tuples. It can stop waiting for a tuple any
time, in which case it outputs failed and stops.

– Phase 2, Pre-contract signing: Protocol “sign” of Scheme 1 is executed
to obtain the pre-contract, using the n fresh keys $(\mathit{pk}_i, \mathit{sk}_i)$.
The pre-contract to be signed is the concatenation of tid, contr, and
$((\mathit{pk}_1, \mathit{ecert}_1), \ldots, (\mathit{pk}_n, \mathit{ecert}_n))$.§

– Phase 3, Conversion of pre-contract into real contract: If Pi gets
output signed for the pre-contract then Pi broadcasts $(r_i, \mathit{cert}_i)$ to all other
signatories. (This allows any other party to check that $\mathit{ecert}_i$ was an encryption
of $\mathit{cert}_i$.)

If Pi receives input wakeup before having received all n pairs, or if not all
pairs match (i.e., there is some j such that $\mathit{ecert}_j \neq E_T(r_j; \mathit{cert}_j)$, or $\mathit{cert}_j$ is
not a valid signature by Pj on $(j, \mathit{pk}_j, \mathit{tid}, \mathit{contr})$) then Pi shows the signed
pre-contract to T.

– Phase 4, Recovery from failed conversion:

• If T gets a signed pre-contract it extracts the list
$((\mathit{pk}_1, \mathit{ecert}_1), \ldots, (\mathit{pk}_n, \mathit{ecert}_n))$ and tries to decrypt all certificates.

If this fails, or if some of the decrypted certificates are not valid or not
of the proper format, then T signs a “failed” message and sends it back.
Otherwise T signs $(\mathit{tid}, \mathit{contr}, \mathit{cert}_1, \ldots, \mathit{cert}_n)$ and sends all certificates
and T’s signature back. (T’s signature certifies that $\mathit{ecert}_j$ was an encryption
of $\mathit{cert}_j$ for each j. Note that usually T does not know $r_j$.)

• If Pi receives an answer from T it verifies T’s signature. If T decided failed
then Pi decides failed and stops. Otherwise Pi checks the consistency of
the data received from T, and if this succeeds decides signed and stops.
If not it decides failed and stops.

Protocol “verify”:

If Pi wants to show a signed contract to verifier V it sends to V the pre-contract
from Phase 2, signed relative to short-term keys $\mathit{pk}_1, \ldots, \mathit{pk}_n$, and (a)
all n pairs $(r_i, \mathit{cert}_i)$ or (b) the vector $(\mathit{tid}, \mathit{contr}, \mathit{cert}_1, \ldots, \mathit{cert}_n)$, signed by T.
V outputs signed if all information matches, and stops.

On input wakeup V outputs verify_failed and stops.
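
The matching check used in Phase 3 and in variant (a) of Protocol “verify” can be sketched as follows. The sketch makes strong simplifying assumptions: `toy_E_T` is a hypothetical stand-in built from SHA-256, not an encryption scheme (T could not decrypt it), and it models only that $E_T(r; x)$ is deterministic once r is fixed. Checking that $\mathit{cert}_j$ is a valid long-term signature is not modeled.

```python
import hashlib


def toy_E_T(r: bytes, x: bytes) -> str:
    # Stand-in for E_T(r; x) (assumption): a hash of (r, x).
    # NOT encryption; it only lets anyone recompute the value from (r, x).
    return hashlib.sha256(len(r).to_bytes(4, "big") + r + x).hexdigest()


def openings_match(ecerts, openings):
    """Check ecert_j == E_T(r_j; cert_j) for every j.

    `ecerts` is the list taken from the pre-contract, `openings` the list
    of pairs (r_j, cert_j) broadcast in Phase 3, in the same order.
    """
    if len(openings) != len(ecerts):
        return False  # some pair (r_j, cert_j) is missing
    return all(
        toy_E_T(r, cert) == ecert
        for ecert, (r, cert) in zip(ecerts, openings)
    )
```

If the check fails for some j, or a pair is missing when wakeup arrives, the signatory would fall back to Phase 4 and show the signed pre-contract to T.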



Theorem 2 (Security of Scheme 2).

Scheme 2 is an asynchronous MPCS with third party T for any t < n. It is
optimistic in all cases and terminates in t + 4 rounds if T is not involved, and
in t + 8 rounds in the worst case.

§ Note that this pre-contract is not the real contract yet, and does not prove anything
to an outside party: we will show that the adversary can generate valid looking
pre-contracts himself, without any interaction with the honest signatories.

Proof. (Sketch) We prove only three properties:

“Correct execution:” In the all-honest case the channels between signatories
ensure eventual delivery and authenticity (Sect. 2). Thus the adversary cannot
disturb the protocol, and it terminates with decision signed.

“Unforgeability:” Assume some honest Pi did not want to sign, but a verifier
V outputs signed. Th. 1 implies that pre-contracts are unforgeable relative to
the short-term keys, and Protocol “verify” requires that one of the short-term
keys was certified by Pi, which contradicts our assumption. (Note that the set
of signatories is fixed in advance; see Sect. 2.)

“No surprises with invalid contracts:” Assume an honest Pi started the protocol,
decided failed, and some honest verifier V decides signed.

Th. 1 implies that there was a run of Protocol “sign” of Scheme 1 that yielded
a signed pre-contract. Pi participated in this run as one of the short-term keys
used was certified by Pi relative to the original (tid, contr), and Pi certifies only
fresh keys. Therefore Pi obtained a signed pre-contract as well. As Pi finally
decided failed it must have obtained “failed” from T in Phase 4, which means
that T was not able to successfully decrypt and check all certificates.

Since T does not distribute inconsistent decisions V accepted the contract
because of the result of a successful Phase 3. Thus there must be n pairs $(r_j, \mathit{cert}_j)$
that are valid certificates and consistent with the signed contract. But T was
not able to obtain the corresponding n certificates $\mathit{cert}_j$. Since T is honest this
contradicts the correctness requirement of the encryption scheme. □

Remark 3. Phase 1 does not guarantee that ecerti actually contains a valid cer- 
tificate certi, i.e., it might happen that there are invalid certificates but Phase 
2 is started and terminates successfully. 

At first glance this might look dangerous, but actually it is not: If at least one 
ecerti is invalid — i.e., decryption does not yield a certificate certi on pki signed 
by Pi — then no party will obtain the real contract. 

Theorem 3 (Abuse freeness). Scheme 2 is abuse-free.

Proof. (Sketch) We prove this property by contradiction: Assume that at a cer- 
tain point in time the adversary has full control over the contract and can con- 
vince an outside party that he can enforce result signed. 

Since the adversary still has full control, both results are possible. Therefore
no party can show a valid pre-contract, as the information contained in it
would determine the result of the contract signing. This follows from Th. 1: If
one signatory had a signed pre-contract then any other signatory would obtain
it as well, and T could unambiguously decide by evaluating this pre-contract.
Therefore no honest signatory will enter Phase 3, i.e., no honest signatory will
open any encryption.

In this situation we can simulate the adversary’s view of the protocol: The
simulator generates fresh keys $(\mathit{sk}_i, \mathit{pk}_i)$, exactly like the honest signatories did.
But instead of encrypting certificates $\mathit{cert}_i$ on these keys, the simulator encrypts
random data of the same length. Since the encryption scheme is secure against
adaptive chosen ciphertext attacks the resulting ciphertexts $\mathit{ecert}_i$ are
indistinguishable from encrypted certificates. This simulation works as long as none of
the ciphertexts must be opened, which is the case as no valid pre-contract exists
yet. Thus, any interaction between the adversary and an outside party can
be simulated without any interaction with the honest signatories. Hence, the
outside party gets no information about whether the honest signatories actually
want to sign a contract, which contradicts our assumption. □

Acknowledgments: We thank Juan Garay, Birgit Pfitzmann, Matthias
Schunter and Michael Steiner for interesting discussions.

Abstract. We prove two results on commutation of languages. First,
we show that the maximal language commuting with a three element 
language, i.e. its centralizer, is rational, thus giving an affirmative answer 
to a special case of a problem proposed by Conway in 1971. Second, we 
characterize all languages commuting with a three element code. The 
characterization is similar to the one proved by Bergman for polynomials 
over noncommuting variables, cf. Bergman, 1969 and Lothaire, 2000: A 
language commutes with a three element code X if and only if it is a 
union of powers of X. 



1 Introduction 

Very little, or in fact almost nothing, seems to be known on solutions of language
equations, the exception being very special equations with two operations
characterizing rational languages, cf. [7] and [?] for some extensions. Even the most
basic equation, namely the commutation XY = YX, is poorly understood. On
the other hand, it poses several natural and apparently very difficult combinatorial
problems.

It was almost 30 years ago when Conway proposed such a problem, asking
whether the maximal set commuting with a given rational set is rational, see [?].
The problem has remained unanswered to date, even for finite sets. Even worse, it
seems to be unknown whether the centralizer of a finite set is recursive, or even
recursively enumerable. A related problem asking whether any decomposable
rational language L, i.e. a rational language having the decomposition L = XY
for some languages $X, Y \neq \{1\}$, is decomposable via rational languages, is much
simpler, as shown in [?], cf. also [?], [?], and [?].

Another related problem is to search for a characterization of all languages
commuting with a given rational or finite set. In the case of multisets, i.e.
polynomials over noncommuting variables and with rational coefficients, this problem
has an elegant solution due to Bergman [?]: Two polynomials p(x) and q(x)
commute if and only if they are linear combinations of powers of a common
polynomial t(x). A similar result holds also for noncommutative formal power
series, see [?].

* The authors acknowledge the support from the Academy of Finland under project
44087.

Recently, both of the above problems have been solved for two element sets
in [?]. In this case, Conway’s problem has an affirmative answer and moreover,
the binary sets possess a Bergman type of characterization: Any set commuting
with a two element set X is a union of powers of X (or just a union of powers of a
primitive word t, if $X \subseteq t^*$, for some word t). On the other hand, as was pointed
out also in [?], no similar characterization can be achieved for four element sets,
in general.

These problems were also considered in the case of codes, in [?]. They have
been completely and affirmatively answered if X is a prefix code, i.e. no word
of X is a prefix of another. Moreover, it was proved that for a prefix code X, its
centralizer is always $X^*$, and the same was conjectured to be true for all codes.
This, however, remains an unsolved and difficult problem.

In this paper, we continue the study of these two problems, considering a
special case of three element sets. We answer Conway’s problem affirmatively
in this case, and show that a Bergman type of characterization, conjectured in [?],
holds for three element codes. Our new idea of considering these problems as
equations on languages, combined with the techniques of [?], [?], and [?], gives
a new insight into the problem. What is encouraging is that we do not see any
reason why this approach cannot be extended to more general cases. We also
point out that in general, the centralizer of any finite set is in co-RE.

The paper is organized as follows. In Section 2, we fix the terminology and 
discuss the background of these problems. Several basic results needed in later 
considerations, as well as two general results on the centralizer of a finite lan- 
guage, are proved in Section 3. Section 4 is devoted to a solution of Conway’s 
problem for three element sets and in Section 5 we prove the characterization 
for languages commuting with a three element code. Several open problems are 
proposed in Section 6. Due to space constraints, the proofs of some results are
only sketched here. They can be found in full length in [?].

2 Preliminaries and Background 

In this section we fix our terminology, and recall several known results related 
to this work. For further details on Combinatorics on Words, we refer to [?].

For two words $u, v \in \Sigma^*$, we say that u is a prefix of v if v = uw, for some
$w \in \Sigma^*$, and write $u \leq v$, and $u = vw^{-1}$; u is a proper prefix of v if both u and w
are nonempty words. We say that u is a suffix of v if v = wu, for some $w \in \Sigma^*$,
in which case we write $u = w^{-1}v$. These notations extend in a natural way to
languages: for two languages $L_1, L_2 \subseteq \Sigma^*$, we denote

$L_1^{-1} L_2 = \{u_1^{-1} u_2 \mid u_1 \in L_1, u_2 \in L_2\}$,

$L_1 L_2^{-1} = \{u_1 u_2^{-1} \mid u_1 \in L_1, u_2 \in L_2\}$.

For a finite language F we define the two parameters

$l_F = \min_{u \in F} |u|, \quad L_F = \max_{u \in F} |u|$,

where |u| denotes the length of the word u. We say that F is periodic if there is
a word u such that $F \subseteq u^*$.

It is an elementary property of commutation of languages that for any subset
$L \subseteq \Sigma^*$, there is a unique maximal language commuting with it and, moreover,
it can be easily proved that it is a monoid. We will call it the centralizer of L
and denote it as C(L). Equivalently, one can define the centralizer of L as the
union of all sets commuting with L.

We say that a language $L \subseteq \Sigma^*$ is a code if the monoid $L^*$ is free. Equivalently,
L is a code iff any equality

$x_1 x_2 \ldots x_m = y_1 y_2 \ldots y_n, \quad m, n > 0, \quad x_i, y_j \in L$

implies n = m and $x_i = y_i$, for all $1 \leq i \leq m$.
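
For finite languages the code property can be tested mechanically. The sketch below uses the classical Sardinas-Patterson test, which is not taken from this paper; it is brought in here only as an illustration, and the function names are hypothetical. The test iterates sets of "dangling suffixes" and reports a double factorization as soon as the empty word appears.

```python
def left_quotient(A, B):
    """A^{-1}B = {w : a w in B for some a in A}."""
    return {b[len(a):] for a in A for b in B if b.startswith(a)}


def is_code(words):
    """Sardinas-Patterson test for a finite set of words."""
    C = set(words)
    if "" in C:
        return False  # a set containing the empty word is never a code
    # S_1: proper dangling suffixes between two distinct codewords.
    S = left_quotient(C, C) - {""}
    seen = set()
    while S and frozenset(S) not in seen:
        if "" in S:
            return False  # some word has two different factorizations
        seen.add(frozenset(S))
        # S_{k+1} = C^{-1} S_k  union  S_k^{-1} C
        S = left_quotient(C, S) | left_quotient(S, C)
    return True
```

The loop terminates because every set consists of suffixes of words in C, so only finitely many sets can occur. For instance, {a, ab, ba, bb} is rejected, in line with the double factorization a·ba = ab·a, while the prefix code {a, ba, bb} is accepted.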

Let $\Sigma$ be a finite alphabet, and $\Xi$ a set of unknowns in one-to-one
correspondence with a set of nonempty words $X \subseteq \Sigma^*$, say $\xi_i \leftrightarrow x_i$, for some fixed
enumeration of X. A (constant-free) equation over $\Sigma$ with $\Xi$ as the set of
unknowns is a pair $(u, v) \in \Xi^* \times \Xi^*$, usually written as u = v. The subset X
satisfies the equation u = v if the morphism $h : \Xi^* \to \Sigma^*$, $h(\xi_i) = x_i$, for all
$i > 0$, is such that h(u) = h(v). These notions extend in a natural way to systems
of equations.

We define the dependence graph of a system of equations S, as the nondirected
graph G, whose vertices are the elements of $\Xi$, and whose edges are the pairs
$(\xi_i, \xi_j) \in \Xi \times \Xi$, with $\xi_i$ and $\xi_j$ appearing as the first letters of the left and right
hand sides of some equation of S, respectively.

The following basic result on combinatorics of words, cf. P], is very useful 
and efficient in our later considerations. 

Lemma 1 (Graph Lemma) Let $S$ be a system and let $X \subseteq \Sigma^+$ be a
subset satisfying it. If the dependence graph of $S$ has $p$ connected
components, then there exists a subset $F$ of cardinality $p$ such that
$X \subseteq F^*$.

Note that in Graph Lemma it is crucial that all words are nonempty. 
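To make the hypothesis of the Graph Lemma concrete, the following sketch counts the connected components of a dependence graph. It assumes that each unknown is written as a single character and that an equation is a pair of nonempty strings over these characters; the function name `dependence_components` and this encoding are illustrative choices, not notation from the paper.

```python
def dependence_components(unknowns, equations):
    """Number of connected components of the dependence graph.

    `unknowns` is an iterable of one-character names; `equations` is a list of
    (lhs, rhs) pairs of nonempty strings over these names.
    """
    parent = {x: x for x in unknowns}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for lhs, rhs in equations:
        # One edge per equation: first letter of the left side to first letter
        # of the right side.
        parent[find(lhs[0])] = find(rhs[0])
    return len({find(x) for x in unknowns})


# Three relations on the unknowns {y, u, v, w}: yu = uy, yv = wy, uyw = vuy.
p = dependence_components("yuvw", [("yu", "uy"), ("yv", "wy"), ("uyw", "vuy")])
```

Here the edges y-u, y-w and u-v connect all four unknowns, so the graph has a single component and, by the lemma, any solution in nonempty words is contained in $F^*$ for a single word set $F$ of cardinality one, i.e. it is periodic.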

Concerning the commutation of languages, we will be interested in the fol- 
lowing two problems, cf. 0 and ca, respectively: 

Conway’s Problem: Is the centralizer of a rational language, rational? 

BTC-Problem: For a given finite set $X \subseteq \Sigma^+$, is it true that for
any set $Y$ commuting with $X$, there exists a set $V \subseteq \Sigma^+$ and
sets $I, J$ of nonnegative integer indices, such that

$$X = \bigcup_{i \in I} V^i \quad \text{and} \quad Y = \bigcup_{j \in J} V^j. \qquad (1)$$

Note that if $X$ and $Y$ satisfy (1), then they commute. The statement of the
BTC-problem is the same as the one proved by Bergman in P] to characterize the
commutation of two polynomials over noncommuting variables. The abbreviation
BTC comes from there: Bergman Type Characterization.
First results on Conway’s problem were achieved in d, where it has been
proved that the answer is affirmative for all prefix codes. Recently, in ^ , the same
was achieved for all binary sets. The BTC-Problem was also answered affirmatively
in these two cases in the same two papers, respectively. Moreover, it was shown
in ^ that the BTC-Problem does not have a positive answer in general, or even
in the case of arbitrary four element sets. A simple counterexample is the set
$X = \{a, ab, ba, bb\}$, which commutes with $X \cup X^2 \cup \{bab, bbb\}$.
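The commutation claimed for this counterexample can be checked by brute force, since both languages are finite. The sketch below assumes languages are Python sets of strings; the helper `concat` is a hypothetical name for the product of two languages.

```python
def concat(a, b):
    """Product of two finite languages: all words u + v with u in a, v in b."""
    return {u + v for u in a for v in b}


X = {"a", "ab", "ba", "bb"}
Y = X | concat(X, X) | {"bab", "bbb"}

# XY = YX holds, although Y contains words such as "bab" outside X and X^2.
commute = concat(X, Y) == concat(Y, X)
```

The check only confirms that the two sets commute; that they are not unions of powers of a common set is the part argued in the cited work.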

3 Auxiliary Results 

In this section we prove some lemmata needed in our later considerations, as 
well as give some general properties of the centralizer. 

In both of the following lemmata, we will consider the maximal languages 
with respect to component-wise inclusion, satisfying some given relations. In
each case, their uniqueness is clear. 

Lemma 2 Let $F_1$, $F_2$, $F_3$, and $G$ be finite languages. If the relation

$$F_1 + F_2 X G = F_3 + Y G, \qquad (2)$$

is satisfiable, then the maximal solution $(X, Y)$ is such that
$Y = F_2 X + H$, for some finite language $H$.

Proof. Clearly, $F_2 X \subseteq Y$, since $Y$ is maximal and we could add to
$Y$ the elements of $F_2 X$ preserving the relation (2).

On the other hand, consider $w \in G$, and $y \in Y$, with
$|y| > \max\{L_{F_1}, L_{F_3}\}$. Then, $yw \in YG$, and hence, $yw \in F_2 X G$:

$$yw = uxv, \qquad u \in F_2,\ x \in X,\ v \in G,$$

which implies that $y = u y_1$, for some $y_1 \in \Sigma^*$. We can add $y_1$ to
$X$ (and hence, $y$ to $F_2 X$) and $F_2 y_1$ to $Y$, preserving the equality.
Hence $Y = F_2 X + H$, where $H \subseteq F_1 G^{-1}$.

It is also useful to note that if the equation (2) has solutions, then
$F_3 \subseteq F_1 + YG$, and $F_1 = F_1' G + F_1''$, with
$F_1'' \subseteq F_3$. In other words, the maximal languages $X$ and $Y$
satisfying (2) actually satisfy $F_1' G + F_2 X G = YG$, too. ■



Lemma 3 Let $F_1$, $F_2$, $G$, and $H$ be finite languages. If the language
equations

$$F_1 X = X F_2, \qquad G + F_1 Y = H + Y F_2,$$

are satisfiable, then their maximal solutions $X$ and $Y$ are such that
$Y = X + A$ for some finite language $A$.

Proof. Consider the equations $u^{-1}(G + F_1 Y) = u^{-1}(H + Y F_2)$, for all
$u \in \Sigma^*$, with length $|u| = 1 + \max(L_G, L_H)$. Since the word $u$ is
longer than the words of $G$ and $H$, we obtain the system
$u^{-1}(F_1 Y) = u^{-1}(Y F_2)$. But then, the same system is satisfied also by
$X$. Since the maximal solution of a given linear system of equations can be
readily seen to be unique, we obtain that $u^{-1}X = u^{-1}Y$, for all $u$ such
that $|u| = 1 + \max(L_G, L_H)$. Furthermore, clearly $X \subseteq Y$, since we
can add $X$ to $Y$, and the relation satisfied by $Y$ is still valid. Hence, $Y = X + A$,
for some finite language A. It is useful to observe that A = ■ ■ 

When computing the centralizer of a language, we will always assume that 
the language does not contain the empty word since otherwise Conway’s problem 
is trivial. Indeed, the centralizer of such a language is always E*, as noticed also 
in Also remark that in this case the BTC-problem has a negative answer; to 
see this, it is enough to consider any language $L$ such that
$L^* \ne \Sigma^*$, and notice that $L' = L \cup \{1\}$ commutes with
$\Sigma^*$, but they are not unions of powers of a third language.

In general, very little is known about the centralizer of a finite language. 
Even much weaker questions than Conway’s question seem to be unanswered, 
namely it is not known whether a centralizer of a finite language is recursive or 
even recursively enumerable. We are able to show only that its complement is 
always recursively enumerable. 

Theorem 4 For any finite set, the complement of its centralizer is a recursively 
enumerable language. 

4 Conway’s Conjecture for Three Element Sets 

We consider now Conway’s problem for finite languages. To start with, let 

FX = XF, (3) 

be the language equation for a given finite language F. Obviously, the language 
F can be uniquely written as 

$$F = u_1 F_1 + u_2 F_2 + \cdots + u_n F_n, \qquad (4)$$

such that the following conditions are satisfied:

(i) $1 \in F_i$, for all $i = 1, \ldots, n$.

(ii) $u_i \notin \mathrm{Pref}(u_j)$, for any $i \ne j$.

We will say that (4) is the prefix decomposition of $F$.

Example 1. The prefix decomposition of $F = \{a, aa, b, bab\}$ is
$F = a \cdot \{1, a\} + b \cdot \{1, ab\}$.
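The prefix decomposition of a finite language of nonempty words can be computed by taking as the words $u_i$ those elements of $F$ that have no other element of $F$ as a prefix, and collecting for each of them the remainders of the words it starts. The following is a minimal sketch under that reading; the function name `prefix_decomposition` and the dictionary representation (mapping each $u_i$ to its set $F_i$, with `""` for the empty word) are assumptions made for illustration.

```python
def prefix_decomposition(f):
    """Map each prefix-minimal word u of F to the set u^{-1}F restricted to F."""
    words = set(f)
    # The u_i are the words of F that have no other word of F as a prefix.
    roots = {u for u in words
             if not any(v != u and u.startswith(v) for v in words)}
    # F_i collects what remains of every word of F that starts with u_i.
    return {u: {w[len(u):] for w in words if w.startswith(u)} for u in roots}


decomposition = prefix_decomposition({"a", "aa", "b", "bab"})
```

For the language of Example 1 this yields the two blocks $a \cdot \{1, a\}$ and $b \cdot \{1, ab\}$; for a set of the form $\{u, uv, uw\}$ it yields a single block.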

We say that two nonempty words are incomparable if neither of them is a 
proper prefix of the other. 

Using this notion to refine the equation FX = XF, we can prove that the 
centralizer of a finite language is rational, provided that one additional condition 
is satisfied: there is a word in F such that it is incomparable with the other words 
of F (for example, all prefix languages satisfy this condition). Using this result, 
we then prove Conway’s conjecture for three element sets. 
Theorem 5 If $F$ is a finite language containing a word which is incomparable
with the other words of $F$, then the centralizer of $F$ is rational and
effectively computable.

Proof. Consider the prefix decomposition of $F$:
$F = u_1 F_1 + \cdots + u_n F_n$, and let $C(F)$ be its centralizer. Then
clearly, $C(F) = u_1 X_1 + \cdots + u_n X_n + X_0$, with

$$X_0 = \{x \in C(F) \mid |x| < L_F\},$$

since the equation (3) implies that all long enough words of $C(F)$ have as a
prefix an element of $F$. Then, the equation (3) implies that

$$F_i C(F) = X_i F + u_i^{-1}(X_0 F), \quad \text{for all } 1 \le i \le n. \qquad (5)$$

Now, the hypothesis implies that one of the sets $F_i$, say $F_1$, contains only
the empty word. We then obtain from the system (5) that

$$F_i X_1 F + F_i (u_1^{-1}(X_0 F)) = X_i F + u_i^{-1}(X_0 F), \quad \text{for all } 2 \le i \le n.$$

By Lemma 2 this means that $F_i X_1 + H_i = X_i$, for all $2 \le i \le n$, and
for some finite sets $H_2, \ldots, H_n$. So,

$$C(F) = X_0 + u_1 X_1 + u_2 (H_2 + F_2 X_1) + \cdots + u_n (H_n + F_n X_1)$$

$$= X_0 + u_2 H_2 + \cdots + u_n H_n + (u_1 + u_2 F_2 + \cdots + u_n F_n) X_1$$

$$= H + F X_1,$$

and this, together with the first equation of the system (5), further implies
that

$$u_1^{-1}(X_0 F) + X_1 F = H + F X_1,$$

where $H = X_0 + u_2 H_2 + \cdots + u_n H_n$. Obviously, $X_1$ is maximal since
$C(F)$ is so. By Lemma 3 we then obtain that $X_1 = A + C(F)$, for some finite
set $A$, and finally,

$$C(F) = H + F(C(F) + A) = H + FA + F\,C(F),$$

i.e., C(F) is rational. Moreover, C(F) is of the form C(F) = F*G, with G a 
finite language. The above equations can be further refined to prove that Lq < 
SLp — 2lp. Consequently, for a given F, one can effectively find G and hence, 
also C(F). The theorem is thus proved. ■ 

It might be possible to refine this method for the general case. For this, we
need to solve the system (5) in the case when all $F_i$'s are different from
$\{1\}$. One way to do this is to consider the prefix decomposition of the
finite sets appearing as coefficients in the left hand side of (5) and refine
the right hand side accordingly. It seems that in this way, we either cycle (a
case which we can solve as in the proof of the next theorem), or we eventually
obtain a case similar to the one in the previous theorem. However, what we can
do now is to settle completely the three element case, as shown in the next
theorem.
Theorem 6 The centralizer of a three element set is rational and effectively
computable.

Proof. By Theorem 5 the only case to be considered is that of a finite set $F$
of the form $F = \{u, uv, uw\}$. In this case, $C(F) = uX_1 + X_0$, with $X_0$
containing only prefixes of $u$. The equation (3) can now be rewritten as
$(1 + v + w)C(F) = X_1 F + u^{-1}(X_0 F)$, and further in the form

$$(u + vu + wu)X_1 + (u^{-1}F)X_0 = X_1 F + u^{-1}(X_0 F).$$

We repeat the same reasoning with $X_1$ by taking the prefix decomposition of
the set $\{u, vu, wu\}$. If the words of $F$ are not powers of the same word
(which is solved in E] and H3I), then after a finite number of steps we obtain
an equation of the form

GXn + Ftl = X„F + 

where all are finite languages, and G is a three word language such 

that at least two of its words start differently. Moreover, we have that
$C(F) = u^n X_n + X_0$, with $X_0$ finite.

We consider now the equation $GY = YF$, and we can prove that its maximal
solution is rational, using the same techniques as in the proof of Theorem 5.
We then use Lemma 3 to obtain that $X_n = Y + A$, for some finite language $A$.
Hence, $X_n$ is also rational. Then the centralizer of $F$ is itself rational,
which concludes the proof. ■

We can also specify the form of the centralizer of a three element set.
Similarly as in the proof of Theorem 5 one can prove that $X_n = G^* A + B$,
with $A$ and $B$ finite languages. Then, $X = u^n G^* A + B'$, and knowing that
$G$ has been obtained from $F$ by shifting $n$ $u$-s from its left to the
right, we obtain that $X = F^* A' + B'$, with $A'$ and $B'$ finite languages.

5 The Solution of the BTC-Problem for Three Element 
Codes 

In this section we characterize all sets commuting with a given three element 
code. The characterization resembles that of Bergman for polynomials over non- 
commuting variables. Namely, we prove that any set of words commuting with 
a three element code X is a union of powers of X. The same condition holds for 
singletons, and also for two word languages, as proved in On the other hand, 
this is not valid anymore for four element sets, as we already mentioned. 

We first recall a result proved in 

Lemma 7 Let $L$ be a language and $C(L)$ its centralizer. For any $x \in L$ and
$z \in C(L)$, there exists an infinite word $u \in C(L)^{\omega}$ such that
$z x^{\omega} = u$.

We will also need the following lemma, proved in El and Hi: 
Lemma 8 Let $X \subseteq \Sigma^*$ be a code such that its centralizer is
$X^*$, and let $Y \subseteq \Sigma^*$ be a language commuting with $X$. If
$Y \cap X^n \ne \emptyset$, for some $n \ge 0$, then $X^n \subseteq Y$.

The first step towards our goal is to prove that the centralizer of a three word 
code F is F*. The next lemma will be the main tool to achieve this, and the 
most difficult part of this paper. 

Lemma 9 Let $F$ be a nonperiodic three word set such that none of its elements
is a prefix of the other two, and let $C(F)$ be its centralizer. Then, all
words of $C(F) - \{1\}$ have as a prefix a word from $F$.

Proof. Let $F$ be the language $\{u, v, w\}$ satisfying the conditions of the
lemma. As we already observed, all long enough words of $C(F)$ have as a prefix
a word from $F$. Let us consider the set of those “short words” of $C(F)$ which
do not have as a prefix a word from $F$. The claim of the lemma is that this
set, say $X_0$, contains only the empty word. Obviously, $1 \in X_0$, so let us
assume that there is a nonempty word $x \in X_0$. By Lemma 7 we have that

$$x u^{\omega} = \alpha_1 \alpha_2 \ldots \alpha_n \ldots$$

$$x v^{\omega} = \beta_1 \beta_2 \ldots \beta_n \ldots \qquad (6)$$

$$x w^{\omega} = \gamma_1 \gamma_2 \ldots \gamma_n \ldots$$

for some $\alpha_i, \beta_i, \gamma_i \in F$, $i \ge 1$. Let us denote
$A = \{\alpha_1, \beta_1, \gamma_1\}$.

If the cardinality of $A$ is 3, that is to say, $A = F$, then by Graph Lemma on
the system (6), we conclude that $F$ is periodic: a contradiction. If $A$ is a
singleton, e.g. $A = \{u\}$, then, as $x$ is from $X_0$, we have that $x$ is a
prefix of $u$: $u = xt$, with $t \ne 1$. Hence, we conclude again by Graph
Lemma on the set of unknowns $\{t, u, v, w\}$ of (6): $F$ must be periodic.

Assume now that $A$ has cardinality 2. In this case, one can prove that $X_0$
is totally ordered by the prefix relation. If in the system (6),
$\alpha_1 = u$ or $\beta_1 = u$, then $u = xt$, for some $t \ne 1$, and we can
conclude again by Graph Lemma. In the remaining case, when
$\alpha_1 = \beta_1 = v$, i.e.

$$xu = v y_1, \qquad xv = v y_2, \qquad xw = u y_3,$$

for some $y_1, y_2, y_3 \in C(F)$, the proof strategy is to distinguish two
cases: $y_2 \in X_0$ (obtaining $y_2 = x$), or $y_2 \notin X_0$ (obtaining
$w \le y_2$), and aiming to apply the Graph Lemma to conclude that $F$ must be
periodic. Due to space limitations, we skip this part of the proof, referring
to P] for the complete proof. ■

We can prove now that F* is the maximal set commuting with F, for any 
three word code. 

Theorem 10 The centralizer of a three word code F is F* . 

Proof. Let $C(F)$ be the centralizer of $F$. We distinguish three cases,
depending on the prefix decomposition of $F$, as defined in Section 4.

I. $F = u_1 + u_2 + u_3$, and none of the words of $F$ is a prefix of another
one. Then, using Lemma 9, we have that
$C(F) = u_1 X_1 + u_2 X_2 + u_3 X_3 + 1$. From the equation
$F\,C(F) = C(F)\,F$, we obtain the following equations:

$$C(F) = X_1 F + 1, \qquad C(F) = X_2 F + 1, \qquad C(F) = X_3 F + 1,$$

implying that $X_1 = X_2 = X_3$, and hence, $C(F) = F X_1 + 1$. But then, the
system is equivalent to $F X_1 + 1 = X_1 F + 1$, which, in turn, is equivalent
to $F X_1 = X_1 F$. Thus, $X_1 = C(F)$ and furthermore,
$C(F) = F\,C(F) + 1$, which, as is well-known, see [Z], implies the claim.

II. $F = u(1 + v) + w$, and $u$ and $w$ are not prefixes of one another.
Similarly as above, we obtain that $C(F) = u X_1 + w X_2 + 1$, and then, the
equations

$$(1 + v)\,C(F) = X_1 F + 1 + v, \qquad C(F) = X_2 F + 1.$$

This implies that $(1 + v) X_2 F + 1 + v = X_1 F + 1 + v$, and so,
$(1 + v) X_2 = X_1$. Indeed, we can add $X_1 F$ to the left hand side, or
$(1 + v) X_2 F$ to the right hand side, preserving the equality, and then we
use the maximality. Hence, $C(F) = F X_2 + 1$, and the above system is
equivalent now to $F X_2 + 1 = X_2 F + 1$. We conclude as in the previous case.

III. $F = u_1 + u_2 + u_3$, with $u_1$ a prefix of both $u_2$ and $u_3$.
Equivalently, $F$ is of the form $F = \{u, uv, uw\}$. Assume that $F^*$ is a
proper subset of $C(F)$, and let $x$ be minimal with respect to the length in
$C(F) - F^*$.

Claim 1: $xu = ux$.

Proof of claim 1. Assume the contrary and set $x_1 = x$. Then we define the
sequence $(x_i)_{i \le n}$, for some $n \ge 1$, such that
$x_i u = u x_{i+1}$, for all $1 \le i \le n - 1$, and either $x_n u = u x_i$,
with $1 \le i \le n$, or $x_n u = ty$, with $t \in \{uv, uw\}$, and
$y \in C(F)$. The first case is impossible since we obtain that
$x_1 = \cdots = x_n$, and hence, $xu = ux$. In the second case we obtain that
$|y| < |x|$ and thus, $y \in F^*$. The conclusion is that
$x u^n = u^{n-1} t y \in F^*$. Similarly we can prove that there is $m \ge 1$
such that $u^m x \in F^*$. But $F$ is a code, i.e. $F^*$ is free, and then,
Schützenberger’s criterion of a free monoid implies that $x \in F^*$, cf. m-.
This is impossible and the claim is thus proved.

As a consequence of the claim we obtain that $x$ is the only minimal element
in $C(F) - F^*$. Let us now assume that $|v| \le |w|$.

It is important to observe that neither $v$, nor $w$ can commute with $u$ since
$F$ is a code.

Claim 2: There is $\alpha \in F^*$ such that $x\alpha \in F^*$.

Proof of claim 2. Assume again the contrary and consider the word $xuv$. We
prove first that there exists $n \ge 0$ such that

$$x v u^n = u^n w x. \qquad (7)$$

If $xuv = uvy$, for some $y \in C(F)$, we have that $|x| = |y|$, and so, either
$x = y$, which is impossible since it implies that $uv = vu$, or
$y \in F^*$, and so, $xuv \in F^*$, which is again a contradiction.

If $xuv = uwy$, we obtain that $|y| \le |x|$, and due to our assumption, we
must have $y = x$, and $xv = wy$, which is what we wanted to prove first.

Finally, if $xuv = uy$, then $y = xv$. In this case, we consider the word
$y_1 = y$, and then, the word $y_1 u = xvu \in C(F)F$. We have
$y_i u = u y_{i+1}$, for all $1 \le i \le n - 1$, for some $n \ge 1$, and
either $y_n u = u y_i$, for some $1 \le i \le n$, or $y_n u = uvt$, or
$y_n u = uwt$, for some $t \in C(F)$. In the first two cases we obtain that
$uv = vu$, which is impossible. In the third one we obtain
$x v u^n = u^n w x$, for some $n \ge 1$.

Let us also consider the word $xuw$. Similarly as above (or using the symmetry
as an argument), one can prove that there is $m \ge 0$ such that

$$x w u^m = u^m v x. \qquad (8)$$

Without loss of generality, we can assume that $m \le n$. If $x \le u^m$, then
$u^m = xy$, with $y \ne 1$, $yu = uy$, and from (7) and (8) we derive that
$u^{n-m} y w = v u^{n-m} y$, $yv = wy$, and $yu = uy$. Using Graph Lemma on
these three relations, on the set of unknowns $\{y, u, v, w\}$, we get now a
contradiction.

If $u^n \le x$, then $x = u^n y$, with $y \ne 1$, $uy = yu$, and we obtain
$yv = wy$, $u^{n-m} y w = v u^{n-m} y$, $yu = uy$, and again this gives the
commutativity of $F$.

Finally, if $u^m \le x \le u^n$, we have $u^n = xy$, and $x = u^m z$, with
$y, z \ne 1$, $yu = uy$, and $zu = uz$. We derive from (7) and (8) that
$x v u^n = u^n w x = (yx) w (u^m z) = y (x w u^m) z = y u^m v x z$, implying
that $uv = vu$, which is impossible. This completes the proof of claim 2.

Claim 3: There is $\beta \in F^*$ such that $\beta x \in F^*$.

Proof of claim 3. This is similar to the proof of claim 2 above.

Since $F$ is a code, from the claims 2 and 3, using Schützenberger’s criterion
of a free monoid, we obtain that $x \in F^*$. This is impossible: $x$ was
chosen in $C(F) - F^*$.

The conclusion is that $C(F) \subseteq F^*$, and thus, $C(F) = F^*$. ■

Now, we are ready for the second main result of this paper. We characterize 
all the sets commuting with a three element code. 

Theorem 11 If $F$ is a three word code, then any set commuting with $F$ is a
union of powers of $F$.

Proof. This is immediate now, using Theorem 10 and Lemma 8. ■

6 Final Remarks 

We have continued the research on the commutation relation XY = YX for 
languages, initiated in ini, m, and 0. Our results settle some basic problems 
for three element sets, and at the same time give indications that these prob- 
lems are very difficult, in general. Indeed, there remain many challenging open 
problems: 

Problem 1 Does the Bergman type of characterization hold for all three element 
sets? 
Problem 2 Does the Bergman type of characterization hold for all codes?



Problem 3 Does Conway’s problem have an affirmative answer for all finite
codes?



Problem 4 Is the centralizer of a code C always C* ? 



Problem 5 For a finite set $F$ we define its root as the set $R(F)$ with the
minimal number of elements such that $F$ is a power of $R(F)$. Does there exist
a finite set $F \subseteq \Sigma^*$ such that its centralizer is different from
$R(F)^*$ and $\Sigma^*$?



Problem 6 Is the centralizer of a finite (or rational) set always: a) recursively
enumerable, b) recursive, c) rational?

By Lemma 1^ an affirmative answer to Problem ^ would imply that for Prob- 
lem El Also note that if the centralizer is defined as the maximal semigroup 
commuting with a given X, then the ProblemElhas an affirmative answer, cf. 
Abstract. Tree-walking automata (TWAs) recently received new attention
in the fields of formal languages and databases. Towards a better
understanding of their expressiveness, we characterize them in terms
of transitive closure logic formulas in normal form. It is conjectured by
Engelfriet and Hoogeboom that TWAs cannot define all regular tree
languages, or equivalently, all of monadic second-order logic. We prove this
conjecture for a restricted, but powerful, class of TWAs. In particular,
we show that 1-bounded TWAs, that is TWAs that are only allowed
to traverse every edge of the input tree at most once in every
direction, cannot define all regular languages. We then extend this result to
a class of TWAs that can simulate first-order logic (FO) and is capable
of expressing properties not definable in FO extended with regular path
expressions; the latter logic being a valid abstraction of current query
languages for XML and semi-structured data.

Keywords: automata, logic, regular tree languages 



1 Introduction 

Regular tree languages can be defined by means of many equivalent formalisms, 
for instance: (non)deterministic bottom-up and nondeterministic top-down tree
automata, alternating tree automata, two-way tree automata, homomorphic im- 
ages of local tree languages, and monadic second-order logic However, it 

is not known whether there exists a natural inherently sequential model for rec- 
ognizing the regular tree languages. Of course, by definition, they are recognized 
by bottom-up finite tree automata, but these automata are essentially parallel 
rather than sequential: the control of the automata is at several nodes of the 
input tree simultaneously, rather than at just one. With this aim in mind, En- 
gelfriet, together with his co-workers Bloem, Hoogeboom, and van Best, initiated 
a research program mm studying (extensions of) the tree-walking automata 
(TWAs) originally introduced by Aho and Ullman j5|. The finite control of a 
tree-walking automaton is always at one node of the input tree. Based on the
label of that node and its child number (which is i if it is the i-th child of its
parent), the automaton changes state and steps to one of the neighboring nodes 
(parent or child). Without the test on the child number such automata cannot 
even search the tree in a systematic way, such as by a pre-order traversal as 
is shown by Kamimura and Slutzki m However, also with the child test, it is
conjectured that these automata cannot express all regular tree languages iBcni- 
In this paper, we study the expressiveness of tree-walking automata by charac- 
terizing them in terms of transitive closure logic formulas in normal form and 
prove the above mentioned conjecture for a restricted, but powerful, class of 
tree-walking automata. 

Apart from the above purely theoretical motivation, recently, new interest in 
tree-walking automata emerged from the field of database theory. Indeed, one 
of the major research topics at the moment is the design and study of query 
languages for the manipulation of XML documents or electronic documents in 
general imsi. Such documents are usually modeled by ordered labeled trees or 
graphs, depending on the application at hand. In this research, tree-walking au- 
tomata are used for various purposes and appeared in various forms. Milo, Suciu, 
and Vianu uni, for instance, used a transducer model based on tree- walking au- 
tomata as a formal model for an XML transformer encompassing most current 
XML transformation languages. Briiggeman-Kleinn, Hermann, and Wood |0|, 
proposed to use caterpillar expressions as a pattern language for XML trans- 
formation languages. Interestingly, caterpillar expressions relate to tree-walking 
automata like regular expressions relate to string automata: they are just a 
different, though a lot more user friendly, representation of the same thing. Fur- 
thermore, they conjectured their formalims to be less expressive than the reg- 
ular tree languages. Another, more direct, occurrence of tree-walking automata 
is embodied in the actual XML transformation language XSLT [7j proposed by 
the World Wide Web consortium (W3C) and currently being implemented by 
IBM. In formal language theoretic terms, this query language can be best de- 
scribed as a tree-walking tree transducer 0 . Hence, results on the expressiveness 
of tree-walking automata could give insight into the expressiveness of actual XML
transformation languages. 

We start by characterizing the expressiveness of (deterministic and
nondeterministic) tree-walking automata in terms of (deterministic and
nondeterministic) transitive closure logic (DTC and TC) formulas in normal
form. That is, formulas of the form
$[(\mathrm{D})\mathrm{TC}(\varphi)](\varepsilon, \varepsilon)$, where
$\varphi$ is an FO formula containing predicates $\mathrm{depth}_m(x)$ defining
$x$ as a vertex whose depth is a multiple of $m$; and where $\varepsilon$
refers to the root of the tree under consideration. Our result thus implies
that any lower bound on (D)TC formulas in normal form is also a lower bound for
(non)deterministic tree-walking automata. Unfortunately, proving lower bounds 
for the latter logic does not seem much easier than the original problem as 
Ehrenfeucht games for DTC and TC are quite involved. Therefore, we use a 
direct approach for a restricted, but expressive, class of tree-walking automata 
in the hope that these techniques will provide insight for the general case. 

We first show that 1-bounded tree-walking automata, that is tree-walking 
automata that are only allowed to traverse every edge of the input tree at most 
once in every direction, cannot define all regular languages. In particular, we 
obtain that they cannot evaluate tree-structured Majority circuits where the
gates have fan-in greater than 2. Next, we generalize this result to a rather 
powerful class of tree-walking automata, called r-restricted. These automata are
rather expressive as they can define all of first-order logic (FO) and are capable
of expressing some tree languages not definable in FO extended with regular
path expressions. The latter logic is an abstraction of current query languages
for semi-structured data and XML ITTTia and, for instance, cannot define the
set of trees representing Boolean circuits evaluating to true, which can easily
be defined by r-restricted tree-walking automata (cf. Example 1).

We conclude by mentioning some related work. Bargury and Makowsky |3j 
proved an equivalence between transitive closure logic and two-way multihead 
automata operating on grids. Their simulation of automata involves nesting of 
TC operators. Potthoff m showed that the same normal form of TC we use, 
suffices to define all regular string languages, the opposite direction being trivial 
in the string case. Recently, Engelfriet and Hoogeboom jS| showed that tree- 
walking automata with pebbles correspond exactly to TC. Hence, when allowing 
pebbles one can simulate nested TC operators. Fülöp and Maneth HH showed
that the domains of partial attributed tree transducers correspond to the tree- 
walking automata in universal acceptance mode. 

This article is structured as follows. In Section 2 we define tree-walking
automata. In Section 0 we prove the logical characterization of tree-walking
automata in terms of transitive closure logic, and in Section 0 we prove the
Engelfriet and Hoogeboom conjecture for two restrictions of tree-walking automata.

2 Preliminaries 

Trees. A tree domain $\tau$ over $\mathbb{N}$ is a subset of $\mathbb{N}^*$,
such that if $v \cdot i \in \tau$, where $v \in \mathbb{N}^*$ and
$i \in \mathbb{N}$, then $v \in \tau$. Here, $\mathbb{N}$ denotes the set of
natural numbers. If $i > 1$ then also $v \cdot (i - 1) \in \tau$. The empty
sequence, denoted by $\varepsilon$, represents the root. We call the elements
of $\tau$ vertices. A vertex $w$ is a child of a vertex $v$ (and $v$ the parent
of $w$) if $vi = w$, for some $i$. A $\Sigma$-tree is a pair
$t = (\mathrm{dom}(t), \mathrm{lab}_t)$, where $\mathrm{dom}(t)$ is a tree
domain over $\mathbb{N}$, and $\mathrm{lab}_t$ is a function from
$\mathrm{dom}(t)$ to $\Sigma$. The arity of a tree is the maximum number of
children of its vertices. We only consider trees with a fixed arity. The depth
of a vertex $u$, denoted by $\mathrm{depth}(u)$, is the length of $u$
(interpreted as a string over $\mathbb{N}$). The distance between two vertices
$u$ and $v$, denoted by $d(u, v)$, is defined as the length of the path between
$u$ and $v$, where we assume that $d(u, u) = 0$.

A $\Sigma$-tree $t$ can be naturally viewed as a finite structure over the
binary relation symbols $E$ and $<$, and the unary relation symbols
$(O_\sigma)_{\sigma \in \Sigma}$. $E$ is the edge relation and equals the set of
pairs $(v, v \cdot i)$ for every $v, v \cdot i \in \mathrm{dom}(t)$. The
relation $<$ specifies the ordering of the children of a node, and equals the
set of pairs $(v \cdot i, v \cdot j)$, where $i < j$ and
$v \cdot j \in \mathrm{dom}(t)$. For each $\sigma$, $O_\sigma$ is the set of
nodes that are labeled with a $\sigma$.
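The definitions of tree domains and of the relations $E$ and $<$ translate directly into code. The sketch below assumes that a vertex is a Python tuple of positive integers (the root being the empty tuple) and that child numbers start at 1; the function names are hypothetical and serve only as an illustration.

```python
def is_tree_domain(vertices):
    """Check closure under parents and under smaller-numbered siblings."""
    dom = set(vertices)
    for v in dom:
        if v:                                   # v = parent . i
            parent, i = v[:-1], v[-1]
            if parent not in dom:
                return False
            if i > 1 and parent + (i - 1,) not in dom:
                return False
    return True


def edge_relation(dom):
    """E: all pairs (v, v.i) with both vertices in the domain."""
    return {(v[:-1], v) for v in dom if v}


def sibling_order(dom):
    """<: all pairs (v.i, v.j) of children of the same vertex with i < j."""
    return {(u, w) for u in dom for w in dom
            if u and w and u[:-1] == w[:-1] and u[-1] < w[-1]}


dom = {(), (1,), (2,), (2, 1)}
depths = {v: len(v) for v in dom}               # depth(v) is the length of v
```

With this representation the depth of a vertex is simply the length of its tuple, and the arity of the tree is the largest child number that occurs.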

Tree-Walking Automata. Tree-walking automata (TWAs) can be seen as the
simplest analogue of two-way string automata. A TWA starts its computation
in an initial state at the root of the input tree. In each step, it moves to a
neighbor vertex of the current vertex (or stays at the current vertex) and
enters a state. The direction of movement and the new state depend only on the
current state, the symbol at the current vertex, the child number of the
current vertex, i.e., the relative position of the current vertex in the
ordered list of the children of its parent, and the number of children of the
current vertex.

More formally, a ($k$-ary) TWA is a tuple $(S, \Sigma, \delta, s_0, F)$, where
$S$ is the set of states, $\Sigma$ is an alphabet (the set of possible vertex
labels), $s_0 \in S$ is the initial state and $F \subseteq S$ is the set of
accepting states. The only part where a TWA is formally different from a
standard string automaton is the transition function $\delta$. The transition
function $\delta$ of a TWA is the union of the functions $\delta^i$ and
$\delta^{\mathrm{root},i}$, where $i \in \{0, \ldots, k\}$. For a deterministic
TWA,

— $\delta^{\mathrm{root},i}$ is a function from $S \times \Sigma$ to
$\{\mathrm{stay}, \downarrow_1, \ldots, \downarrow_i\} \times S$, and

— for each $i \in \{0, \ldots, k\}$, $\delta^i$ is a function from
$\{1, \ldots, k\} \times S \times \Sigma$ to
$\{\uparrow, \mathrm{stay}, \downarrow_1, \ldots, \downarrow_i\} \times S$.

For a nondeterministic TWA the ranges of these functions are the respective
power sets. If several TWAs are around we write $\delta_M$ to denote the
transition function of TWA $M$ and the like.

A configuration $c = [v, s]$ of a TWA $M$ on a tree $t$ consists of a vertex
$v$ of $t$ and a state $s$ of $M$. The immediate successor configuration $c'$
of a configuration $[v, s]$ is defined as follows.

— If $v = \varepsilon$, $v$ has $i$ children and carries the symbol $\sigma$
then

• $c' = [\varepsilon, s']$, if $\delta^{\mathrm{root},i}(s, \sigma) = (\mathrm{stay}, s')$, and

• $c' = [j, s']$, if $\delta^{\mathrm{root},i}(s, \sigma) = (\downarrow_j, s')$.

— If $v = wj$, for some $j \le k$, $v$ has $i$ children and carries the symbol
$\sigma$ then

• $c' = [w, s']$, if $\delta^i(j, s, \sigma) = (\uparrow, s')$,

• $c' = [v, s']$, if $\delta^i(j, s, \sigma) = (\mathrm{stay}, s')$, and

• $c' = [v\ell, s']$, if $\delta^i(j, s, \sigma) = (\downarrow_\ell, s')$.

We write $c \Rightarrow_{M,t} c'$ to express that $c'$ is the immediate
successor configuration of $c$ in the computation of $M$ on $t$. Following
standard convention we write $c \Rightarrow^{j}_{M,t} c'$
($c \Rightarrow^{*}_{M,t} c'$) to express that $c'$ is the $j$-th (some)
successor configuration of $c$ in the computation of $M$ on $t$. If $M$ and/or
$t$ are clear from the context we may omit them. A tree $t$ is accepted by $M$
if there is an $s \in F$ such that
$[\varepsilon, s_0] \Rightarrow^{*}_{M,t} [\varepsilon, s]$.

Example 1. We illustrate the above definition by means of an example. In par- 
ticular we define a deterministic tree-walking automaton that accepts all tree- 
structured Boolean circuits of fan-in 2 that evaluate to true. A similar construc- 
tion can be given for each fixed bound on the fan-in. For convenience, we only 
consider circuits of the right format. That is, all inner nodes and the root have 
exactly two children and are labeled with AND or OR. Further, all leaves are 
labeled by 0 or 1 . These circuits are assigned a truth value in the usual way. De- 
fine M as the tuple (S', A, 6, eval, F) with S = (eval, 0, 1, left-child-0, left-child-1}, 
A = (AND, OR, 0, 1}, and F = {!}. The transition function 6 is defined as fol- 
lows: 
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— M starts by walking to the first leaf in the tree: for each cr G {AND, OR} 
and i G (1, 2}, 

i5ro°*’2(eval, cr) = (},i, eval); and 

S^(i,eval, tr) =(},i,eval). 

— When reaching a zero (one) leaf, which is a left child, M moves up with this 
information: 

5°(l,eval, 0) = (t, left-child-0); 

5°(l,eval, 1) = (t, left-child-1); 

— If M enters an inner vertex from a left child it is always in one of the two 
states left-child-0 or left-child-1. If the current vertex is an AND vertex and
the state is left-child-0 then the current vertex will evaluate to 0 no matter
the right subtree. An analogous statement holds for OR-gates and the state
left-child-1. The information passed to the parent vertex depends on whether
the current vertex is itself a left or a right child. If the outcome of the left
child is not sufficient to determine the value of the current vertex M has to
enter its right child. Formally, for each $i \in \{1, 2\}$:

$\delta^2(1, \text{left-child-0}, \text{AND}) = (\uparrow, \text{left-child-0})$;

$\delta^2(2, \text{left-child-0}, \text{AND}) = (\uparrow, 0)$;

$\delta^2(1, \text{left-child-1}, \text{OR}) = (\uparrow, \text{left-child-1})$;

$\delta^2(2, \text{left-child-1}, \text{OR}) = (\uparrow, 1)$;

$\delta^2(i, \text{left-child-1}, \text{AND}) = (\downarrow_2, \text{eval})$; and

$\delta^2(i, \text{left-child-0}, \text{OR}) = (\downarrow_2, \text{eval})$.

— If M enters an inner vertex from a right child it is always in one of the states 
0 or 1. By what we have said before, this state indicates the value of the 
subtree at the current vertex. Hence, it only has to be passed to its parent 
vertex. Consequently, for each $i \in \{0, 1\}$ and $\sigma \in \{\text{AND}, \text{OR}\}$:

$\delta^2(1, 0, \sigma) = (\uparrow, \text{left-child-0})$;

$\delta^2(1, 1, \sigma) = (\uparrow, \text{left-child-1})$; and

$\delta^2(2, i, \sigma) = (\uparrow, i)$.

— It remains to handle the case of leaves that are right children of their parent. 
They simply have to pass their value to the parent. For each $i \in \{0, 1\}$:

$\delta^0(2, \text{eval}, i) = (\uparrow, i)$.
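The walk described by these transitions can be simulated directly. The following Python sketch is only an illustration: the `Node` class, the move names `"up"`, `"down1"`, `"down2"`, and the treatment of the root (it reports like a right child and then halts) are assumptions made here, since the example does not spell out what happens when the walk returns to the root.

```python
class Node:
    """A tree vertex with a label, ordered children and a parent pointer."""
    def __init__(self, label, *children):
        self.label = label          # "AND", "OR", "0" or "1"
        self.children = children
        self.parent = None
        self.child_no = None        # 1 or 2; stays None at the root
        for number, child in enumerate(children, start=1):
            child.parent = self
            child.child_no = number

def delta(child_no, state, label):
    """Return (move, new_state); move is "up", "down1" or "down2"."""
    def report(bit):
        # A left child reports left-child-<bit>, a right child reports <bit>.
        return ("up", f"left-child-{bit}" if child_no == 1 else bit)
    if label in ("0", "1"):                 # leaf, always entered in state eval
        return report(label)
    if state == "eval":                     # first visit of an inner vertex
        return ("down1", "eval")
    if state in ("0", "1"):                 # coming back from the right child
        return report(state)
    bit = state[-1]                         # coming back from the left child
    decided = (label, bit) in {("AND", "0"), ("OR", "1")}
    return report(bit) if decided else ("down2", "eval")

def accepts(root):
    node, state = root, "eval"
    while True:
        # Assumption: the root behaves like a right child and then halts.
        child_no = 2 if node.parent is None else node.child_no
        move, state = delta(child_no, state, node.label)
        if move == "up":
            if node.parent is None:
                return state == "1"         # F = {1}
            node = node.parent
        else:
            node = node.children[0 if move == "down1" else 1]
```

Note that the simulation keeps only the current vertex and one of five states; the short-circuit rule for AND and OR is what makes it unnecessary to remember the value of the left subtree.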

It will be convenient to subsume the overall effect of a TWA on a subtree of a 
tree in a so-called behaviour function. Intuitively, if $f$ is the behaviour function
of a subtree t then f(s) = s' if and only if in the computation of M which starts 
at the root v of t in state s the parent of v is entered in state s' . Obviously, the 
behaviour function of a subtree t depends on the child number of its root in the 
full tree. 







F. Neven and T. Schwentick 



We define behaviour functions more formally. Let $M$ be a $k$-ary deterministic
TWA and let $t$ be a (at most) $k$-ary tree. For $i \le k$, let $t(i)$ denote the tree which
consists of a root which has the root of $t$ as the $i$-th child and $i - 1$ other children
which are leaves. The behaviour function $f_{M,t,i}$ of $M$ on $t$ as an $i$-th child maps
states of $M$ to states of $M$. It is defined as follows. If $s$ is a state of $M$ then
$f_{M,t,i}(s) = s'$, if there is a $j$ such that $[i, s] \vdash^{j}_{M,t(i)} [\varepsilon, s']$ and $j$ is minimal with
this property.

Note that $f_{M,t,i}$ does not depend on the labels of the vertices outside of $t$.
Furthermore it does not depend either on the actual embedding of $t$ in a larger
tree as long as the root of $t$ is an $i$-th child.

For non-deterministic TWAs behaviour functions are defined analogously but 
with sets of states as function values. 

Next, we turn to the definition of some classes of restricted TWAs. 

— We call a TWA 1-bounded, if, for all trees t, it traverses each edge of t at 
most once in each direction. 

— We call a TWA $M$ $r$-restricted if the following holds. For each pair $u, v$ of
vertices of a tree $t$ such that $d(u, v) > r$ the computation of $M$ on $t$ does
not contain 4 configurations $[u, s_1], [v, s_2], [u, s_3], [v, s_4]$ in the given order.
Intuitively, this means that each path of length more than $r$ is traversed at
most once in each direction.

Clearly, each 1-bounded TWA is also r-restricted for every r > 1. Engelfriet 
and Hoogeboom showed that the latter automata can define all of first-order 
logic (FO) 0. Moreover, r-restricted TWAs can even define some tree languages 
not definable in FO extended with regular path expressions El- In Section 4 we
show that r-restricted TWAs cannot define the set of all regular tree languages 
thereby giving an answer to the conjecture of Engelfriet and Hoogeboom for a 
powerful class of TWAs. 

Let, for $m > 0$, $\mathrm{depth}_m$ be a unary relation symbol. In the following we will
consider trees which have additionally the predicate $\mathrm{depth}_m$, for some $m$. In all
trees, $\mathrm{depth}_m$ will contain all vertices the depth of which is a multiple of $m$. For
a vertex $u$ of a tree $t$, its $r$-sphere $S^t_r(u)$ is the set $\{v \mid d(u,v) \le r\}$. For a tuple
of vertices $\bar{u}$, define $S^t_r(\bar{u})$ as $\bigcup_{u \in \bar{u}} S^t_r(u)$, and define its $r$-neighborhood $N^t_r(\bar{u})$
as the structure $t$ extended with the constants $\bar{u}$ restricted to the set $S^t_r(\bar{u})$.

3 A Logical Characterization of Tree-Walking Automata

We characterize tree-walking automata by transitive closure logic formulas of
the form $\mathrm{TC}[\varphi(x, y)](\varepsilon, \varepsilon)$, where $\varphi$ is an FO formula, which may make use of
the predicate $\mathrm{depth}_m$, for some $m$. We say that such a formula is in normal form.
Further,

$t \models \mathrm{TC}[\varphi(x, y)](\varepsilon, \varepsilon)$

iff the pair $(\varepsilon, \varepsilon)$ is in the transitive closure of the relation $\{(u,v) \mid t \models \varphi[u,v]\}$.
We use deterministic transitive closure logic formulas (DTC) in an analogously
defined normal form to capture deterministic tree-walking automata. In particular,

$t \models \mathrm{DTC}[\varphi(x, y)](\varepsilon, \varepsilon)$

iff the pair $(\varepsilon, \varepsilon)$ is in the transitive closure of the relation $\{(u, v) \mid t \models \varphi[u, v] \wedge
(\forall z)(\varphi[u, z] \rightarrow z = v)\}$. The latter expresses that we disregard vertices $u$ that
have multiple $\varphi$-successors.
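The difference between the two semantics can be made concrete on a finite relation. The Python sketch below is illustrative only: the relation is given explicitly as a set of pairs standing for $\{(u,v) \mid t \models \varphi[u,v]\}$, and the function names are hypothetical.

```python
def successors(relation):
    """Map each vertex to the set of its successors in the relation."""
    succ = {}
    for u, v in relation:
        succ.setdefault(u, set()).add(v)
    return succ

def in_transitive_closure(relation, source, target):
    """True iff target is reachable from source in one or more steps."""
    succ = successors(relation)
    seen, stack = set(), list(succ.get(source, ()))
    while stack:
        v = stack.pop()
        if v == target:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(succ.get(v, ()))
    return False

def deterministic_part(relation):
    """Keep only the pairs (u, v) where v is the unique successor of u."""
    succ = successors(relation)
    return {(u, v) for (u, v) in relation if len(succ[u]) == 1}
```

For the relation `{("e", "a"), ("a", "e"), ("a", "b")}` the pair `("e", "e")` is in the transitive closure, but not in the closure of the deterministic part, because `a` has two successors and is therefore disregarded.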

Theorem 2. Nondeterministic tree-walking automata accept precisely the tree
languages definable by TC formulas in normal form. Deterministic tree-walking
automata accept precisely the tree languages definable by DTC formulas in normal
form.

Proof, (sketch) The simulation of TWA’s by (D)TC formulas in normal form is 
an easy extension of a proof of Potthoff HH| who characterized two-way string 
automata by means of TC formulas in normal form. We omit the details. 

For the other direction, consider the formula $\mathrm{TC}[\varphi(x, y)](\varepsilon, \varepsilon)$. By Gaifman's
Theorem (see, e.g., [12]), there exists an $r$ such that $\varphi$ is equivalent to a Boolean
combination of sentences $\chi$ and $r$-local formulas $\psi(x, y)$. Here, a formula $\psi(x, y)$ is
$r$-local if for every tree $t$ with vertices $u$ and $v$, whether $t \models \psi[u, v]$ only depends
on the isomorphism type of $N^t_r(u, v)$. As the rank of trees is fixed, there are only
finitely many possible isomorphism types. Engelfriet and Hoogeboom showed
that TWAs can evaluate FO formulas over trees. A straightforward adaptation
of their proof shows that TWAs can also evaluate FO formulas with modulo
depth predicates (details omitted). Hence, we can assume that $M$ can evaluate
the sentences $\chi$ at the beginning of the computation. Further, if $d(u, v) \le 2r$
then $M$ can check whether $t \models \varphi[u,v]$ by inspecting $N^t_{3r}(u)$, otherwise by first
inspecting $N^t_r(u)$ and afterwards $N^t_r(v)$. Suppose the automaton arrives at a
vertex $u$ (with $u = \varepsilon$ as the first case). First, the automaton nondeterministically
chooses whether it will go to a vertex $v$ of distance $\le 2r$ or to a vertex $v$ of
distance $> 2r$. In the first case, it inspects $N^t_{3r}(u)$ and chooses a $v$ such that
$t \models \varphi[u,v]$ if this is possible. Otherwise it first computes the type of $N^t_r(u)$,
moves to a vertex $v$ of distance $> 2r$ and checks that the type of $N^t_r(v)$ implies
that $t \models \varphi[u,v]$. Finally, if $v$ is the root, the automaton accepts. Otherwise, it
proceeds in the same manner. Clearly, the automaton will eventually accept if
$t \models \mathrm{TC}[\varphi(x, y)](\varepsilon, \varepsilon)$.

The simulation of DTC formulas by deterministic TWAs is more intricate.
Consider the formula $\mathrm{DTC}[\varphi(x, y)](\varepsilon, \varepsilon)$. We construct a deterministic TWA over
$k$-ary trees accepting exactly the $k$-ary trees the above formula defines. As in
the previous case, the automaton first evaluates all sentences $\chi$ in the Gaifman
normal form of $\varphi$. Let $m$ be the maximum number of vertices occurring in an
$r$-neighborhood of a tree. That is, $m := \max\{|S^t_r(u)| \mid t \text{ a tree}, u \in \mathrm{dom}(t)\}$.
For each isomorphism type $\tau$ of $r$-neighborhoods with one distinguished vertex,
the automaton additionally computes the number of occurrences of $\tau$ in $t$ up
to $m + 2$. Now, given a $u$, to find a $v$ such that $\varphi(u,v)$ holds, the automaton
proceeds as follows. Let $Y_0$ be the set of all vertices $w$ in $N^t_{2r}(u)$ such that






$N^t_r(u,w) \models \varphi(x,y)$. By inspecting the $3r$-neighborhood of $u$, $M$ can compute $Y_0$.

Next, it computes the type of $N^t_r(u)$ and the set $T$ of all types $\tau$ of $r$-neighborhoods
for which the following holds: if $N^t_r(w)$ is of type $\tau$ and $N^t_r(w)$ and
$N^t_r(u)$ are disjoint then $N^t_r(u,w) \models \varphi(x,y)$. The latter is a fixed finite computation
which can be encoded into the transition function. We call vertices of a type
from $T$ good vertices (w.r.t. the current $u$) and denote the set of good vertices
in $t$ by $Y_1$. Note that $M$ can deduce $|Y_1|$ (up to $m + 2$) from the precomputed
information.

The set of good vertices in $N^t_{2r}(u)$ is denoted by $Y_2$. By inspecting the
$3r$-neighborhood of $u$ again, $M$ computes $|Y_2|$ and the relative positions of all
vertices in $Y_2$ w.r.t. $u$.

Let $Y = Y_0 \cup (Y_1 - Y_2)$. $Y$ is the set of vertices $w$ of $t$ such that $N^t_r(u, w) \models
\varphi(x,y)$. $M$ can compute $|Y|$ without further moving (note that $|Y_2| \le m$). If $|Y|$
is different from 1 then $M$ can immediately reject. Let us assume in the following
that $|Y| = 1$.

If the unique element $v$ of $Y$ is from $Y_0$, $M$ can directly go to $v$. If, on the
other hand, $Y_0 = Y_2 = \emptyset$ then $M$ can move to the unique $v \in Y_1$ via a DFS
traversal of the tree.

The only complicated case is when $Y_0 = \emptyset$ and $|Y_1 - Y_2| = 1$ but $Y_2 \ne \emptyset$. The
complication arises from the following possibility. $M$ has to traverse the tree to
find the correct unique good $w$ outside $N^t_{2r}(u)$. As it does not know in which
part of the tree $w$ is located its way to $w$ might lead back to $u$. In that case we
have to make sure that it does not confuse $w$ with a vertex from $Y_2$.

We can assume w.l.o.g. that the root of the tree is not in because 

otherwise $M$ can easily distinguish the vertices of $Y_2$ from the desired vertex.
Now, $M$ proceeds as follows. It starts a DFS walk through the tree starting
from $u$ and first inspecting the first subtree of $u$. Whenever it encounters a new
vertex $z$ it starts a subcomputation which inspects the $3r$-neighborhood of $z$ to
find out whether there is a good vertex $w$ in $N^t_{2r}(z)$. If such a $w$ is found then
M computes the set Z{w) of vertices u' G for which N^^{u') = N^j.{u). 

If $Z(w) = \emptyset$ or all vertices in $Z(w)$ are behind $z$ in the DFS order then $w$ is
the desired vertex $v$ and $M$ goes there. Otherwise it proceeds in its DFS walk.
When the DFS walk finishes at the root (without finding the target vertex) then
$M$ walks back (reverse DFS) until it reaches $u$ again (easily recognized by the
isomorphism type of its $3r$-neighborhood). Then it starts a reverse DFS walk
from $u$ (going upwards first) analogously to the first DFS walk.

We have to show that $M$ always finds the correct target vertex. First, we
show that $M$ never moves to a vertex in $Y_2$. Let $w \in Y_2$. If $M$ reaches $w$ during
the inspection of the neighborhood of a vertex $z$ before its DFS walk arrives at
the root then $u \in Z(w)$ and $M$ recognizes that $u$ is in the DFS order before $z$.
Hence it does not take $w$ as $v$. The analogous statement is true if $z$ is found in
the reverse DFS walk after coming back to $u$.

Finally, we show that $M$ indeed reaches a target vertex $v$ from $Y_1$ (which then
is the correct one). Assume w.l.o.g. that $v$ is encountered as a vertex $w$ relative
to a vertex $z$ in the DFS walk behind $u$. If $Z(v) = \emptyset$ then $v$ is easily identified.
Assume $Z(v) \ne \emptyset$. As $v$ is not in $N^t_{2r}(u)$ all vertices in $Z(v)$ are behind $u$ in the
DFS order. Let $u'$ be the first vertex from $Z(v)$ in the DFS order. Then $v$ is
found not later than at the time $u'$ is assumed as $z$ in the DFS walk.

4 Weakness of Tree-Walking Automata

Let $k$ be fixed and let $T_k$ be the set of all $k$-ary trees the leaves of which are
labeled with 0 or 1. For each vertex $v$ of a tree $t \in T_k$ we inductively define a
value 0 or 1 as follows. If $v$ is a leaf then its value (0 or 1) is determined by its
label (0 or 1). If $v$ has $i$ children then $v$ has the value 1 if and only if at least
$i/2$ of its children have the value 1. Intuitively, $T_k$ is the set of all tree-structured
Majority circuits. Let $T^1_k$ ($T^0_k$) denote the set of all trees $t \in T_k$ for which the
root gets the value 1 (0). We call a vertex $v$ of a tree $t \in T_k$ a 1-vertex (0-vertex)
if it gets the value 1 (0). Analogously, we call the subtree rooted at a 1-vertex
(0-vertex) a 1-subtree (0-subtree). The next lemma follows immediately from
the inductive definition of the set $T^1_k$.

Lemma 3. For each $k$, the set $T^1_k$ is a regular set of trees.
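The inductive definition of the value amounts to a bottom-up evaluation with two possible results per vertex, which is the intuition behind the regularity claim. The following Python sketch assumes a hypothetical encoding in which a leaf is the integer 0 or 1 and an inner vertex is a tuple of subtrees.

```python
def value(tree):
    """Value (0 or 1) of a tree-structured Majority circuit.

    Assumed encoding: a leaf is the int 0 or 1, an inner vertex is a tuple
    of subtrees. A vertex with i children gets value 1 iff at least i/2 of
    its children have value 1.
    """
    if isinstance(tree, int):
        return tree
    ones = sum(value(child) for child in tree)
    return 1 if 2 * ones >= len(tree) else 0
```

With two children per vertex the threshold is one child, so every inner vertex behaves like an OR-gate; with three children the threshold is two.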

We have seen in Example 1 that there is a deterministic TWA which recognizes
$T^1_2$ (note that $T^1_2 \cup T^0_2$ can be seen as the set of trees representing Boolean
circuits consisting of only OR-gates). This was due to the fact that the automaton,
after evaluating a right subtree of a vertex $v$, could conclude the value of
the corresponding left subtree of $v$ from the label of $v$ and the fact that it had
to enter the right subtree. For $k > 2$, things are more complicated. In fact, we
conjecture the following.

Conjecture 4. For $k > 2$, $T^1_k$ cannot be recognized by a (det. or nondet.) TWA.

In this section we prove this conjecture for a restricted type of TWAs,
1-bounded TWAs. The proof can be easily generalized to the sets $T^1_k$, for each
$k > 3$. Furthermore, it can be extended to show that, for each $r$ there is a related
regular set of trees which cannot be recognized by any $r$-restricted TWA.

Before we state and prove the result we introduce some important concepts
for that proof and show a purely combinatorial result. For any $d > 1$ a critical
tree of depth $d$ is a full ternary tree $t$ of depth $d$ with the following properties:
(i) $t \in T^1_3$; (ii) each 1-vertex of $t$ has exactly two children which are 1-vertices;
and (iii) each 0-vertex has only 0-vertices as children. In particular, there are
no 1-leaves in 0-subtrees of $t$. Intuitively, a critical tree of depth $d$ is a tree of
depth $d$ from $T_3$ which contains a full binary subtree of depth $d$ which only has
1-leaves and all other leaves are 0-leaves. In particular, a critical tree of depth
$d$ has $2^d$ 1-leaves. A numbering $N$ of a critical tree $t$ of depth $d$ is an injective
mapping of the 1-leaves of $t$ into the set $\{0, \ldots, 2^d - 1\}$. All critical trees of a
fixed depth $d$ are defined on the same tree domain $\tau_d$. A mapping $\mathcal{M}$ which maps
each leaf of $\tau_d$ to a subset of $\{0, \ldots, 2^d - 1\}$ with at most $m$ elements is called an
$m$-labeling. We say that an $m$-labeling $\mathcal{M}$ of $\tau_d$ is compatible with a numbering $N$
of a critical tree $t$ of depth $d$ if, for each 1-leaf $v$ of $t$, $N(v) \in \mathcal{M}(v)$. We state
the following lemma without proof.

Lemma 5. For each $m > 0$ there is a $d > 0$ such that, for each $m$-labeling $\mathcal{M}$
of $\tau_d$ there is a critical tree of depth $d$ for which there is no numbering that is
compatible with $\mathcal{M}$.
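To get a feeling for critical trees, one can enumerate them for small depths. The Python sketch below is an illustration under an assumed encoding (a leaf is the integer 0 or 1, an inner vertex is a 3-tuple of subtrees); the helper names are hypothetical, and the depth-0 case is only a recursion base.

```python
from itertools import product

def value(tree):
    """Majority value: 1 iff at least half of the children have value 1."""
    if isinstance(tree, int):
        return tree
    ones = sum(value(child) for child in tree)
    return 1 if 2 * ones >= len(tree) else 0

def zero_tree(depth):
    """The full ternary tree of the given depth whose leaves are all 0."""
    return 0 if depth == 0 else (zero_tree(depth - 1),) * 3

def critical_trees(depth):
    """Yield every critical tree of the given depth."""
    if depth == 0:
        yield 1                      # recursion base: a single 1-leaf
        return
    ones = list(critical_trees(depth - 1))
    zero = zero_tree(depth - 1)
    for zero_position in range(3):   # which child is the 0-subtree
        for left, right in product(ones, repeat=2):
            children = [left, right]
            children.insert(zero_position, zero)
            yield tuple(children)

def count_one_leaves(tree):
    if isinstance(tree, int):
        return tree
    return sum(count_one_leaves(child) for child in tree)
```

Under this encoding there are 3 critical trees of depth 1 and 27 of depth 2, each with $2^d$ 1-leaves; the rapid growth in the number of critical trees on the same domain is what a counting argument such as Lemma 5 can exploit.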

We are now ready to prove the main result of this paper. 

Theorem 6. (a) There is no 1-bounded (deterministic or non-deterministic)
TWA which recognizes $T^1_3$.

(b) For each $r > 0$, there is a regular tree language that cannot be recognized
by an $r$-restricted (deterministic or non-deterministic) TWA.

Proof (sketch). The main task is to prove (a). Statement (b) will follow by an
easy generalization of that proof. Towards a contradiction assume there exists a
non-deterministic 1-bounded TWA $M'$ which recognizes $T^1_3$. The proof consists of 3 main
steps:

— We transform $M'$ into a TWA $M$ which accepts exactly the same trees as
$M'$ but always rejects when it visits a 0-leaf. From this property we can
conclude that if $M$ accepts a certain tree $t$ then it also accepts every tree
which results from $t$ by replacing 0-subtrees with 1-subtrees.

— We show that there are two trees $t, t' \in T^1_3$ with the same tree domain such
that $M$ has accepting computations for these trees which enter some vertex
$v$ in the same state $s$ but with different “histories”. Here, the history consists
of the information for which vertices $w$ on the path from $v$ to the root the
computation visits the (only) 1-sibling of $w$ before it visits $v$. This part of
the proof makes use of Lemma 5.

— Finally, we show that these accepting computations can be combined into
one accepting computation on a tree from $T^0_3$.

Before we describe the construction of $M$, we introduce some notation. Let,
for $i \in \{1, \ldots, 3\}$, $F^1_{M',i}$ ($F^0_{M',i}$) denote the set $\{f_{M',t,i} \mid t \in T^1_3\}$ ($\{f_{M',t,i} \mid
t \in T^0_3\}$) of behaviour functions that $M'$ can have on 1-subtrees (0-subtrees)
that have child number $i$. A straightforward argument shows that $M'$ can never
recognize the set $T^1_3$ when $F^1_{M',i} \cap F^0_{M',i} \ne \emptyset$ for some $i$. Therefore, we can assume
that $F^1_{M',i}$ and $F^0_{M',i}$ are disjoint for all $i$.

We turn to the construction of $M$. Intuitively, the idea is as follows. Whenever
$M'$ can go from a vertex $v$ to $vi$ then $M$ can either do the same or, to prevent
visiting a 0-leaf, it can guess that $vi$ is the root of a 0-subtree. In the latter case
instead of going down to $vi$ it picks a behaviour function $f \in F^0_{M',i}$ and enters a
new state at $v$ according to $f$. Formally, for each $j \le k$, $s \in S$ and $a \in \Sigma - \{0\}$,

$\delta_M(j,s,a) = \delta_{M'}(j,s,a) \cup \bigcup \{(\mathrm{stay}, f(s')) \mid (\downarrow_i, s') \in \delta_{M'}(j,s,a),\ f \in F^0_{M',i}\}$.

For each $j \le k$ and $s \in S$, we define $\delta_M(j,s,0) = (\mathrm{stay}, \bot)$ where $\bot$ is a state
from which no transition is possible.

We have to show that $M$ accepts exactly all trees in $T^1_3$. Let therefore $t$ be
from $T^1_3$, hence, by assumption, $t$ is accepted by $M'$.

Let $C' = c'_0, c'_1, \ldots, c'_{m'}$ be an accepting computation of $M'$ on $t$. We show
that there is also an accepting computation $C = c_0, \ldots, c_m$ of $M$ on $t$. We
construct $C$ by suitably modifying $C'$. Let $v = wj$ be a 0-leaf of $t$, for some
vertex $w$ and some $j$. By the prerequisite on $M'$, $v$ is at most visited once in
$C'$. If $v$ is not visited at all, we do not need to modify $C'$ with respect to $v$.
If $v$ is visited then there are successive configurations $[w, s_i], [v, s_{i+1}], [w, s_{i+2}]$
in $C'$. Here, we assume w.l.o.g. that $M'$ moves in each step. In $C$ we replace
these 3 configurations by $[w, s_i], [w, s_{i+2}]$. This reflects a legal transition of $M$
as it corresponds to the computation of $M'$ on the 0-subtree which consists of a
single 0-leaf. We end up with a legal accepting computation $C$ on $t$.

For the opposite direction, let $C = c_0, \ldots, c_m$ be an accepting computation
of $M$ on a tree $t \in T_3$. We construct a tree $t'$ by replacing some of the 0-leaves
of $t$ by 0-trees and an accepting computation $C'$ of $M'$ on $t'$ by modifying $C$
accordingly. By the construction of $t'$ it will follow that $t \in T^1_3$ if and only if
$t' \in T^1_3$. Hence, as $M'$ accepts $t'$ it also accepts $t$.

More formally, let $[v, s], [v, s']$ be two successive configurations from $C$ such
that it does not hold $[v, s] \vdash_{M',t} [v, s']$. Hence, this subcomputation is possible
only by the new transitions introduced in $M$. By definition, there must be $j \le k$,
$s'' \in S$, $f \in F^0_{M',j}$ and a 0-tree $t_0$ such that $[v, s] \vdash_{M',t} [vj, s'']$ and $f_{M',t_0,j}(s'') =
s'$. From this we can conclude that there exist configurations $c(v, 1), \ldots, c(v, l)$
such that $[v, s], [vj, s''], c(v, 1), \ldots, c(v, l), [v, s']$ is a legal subcomputation of $M'$
on the tree, denoted by $t(v)$, in which the subtree rooted at $vj$ is replaced by
the 0-subtree $t_0$. If $t \in T^0_3$ then also $t(v) \in T^0_3$ as we only replaced a (0- or 1-)
subtree by a 0-subtree. By inductively applying this argument we arrive at a
tree $t'$ and an accepting computation $C'$ on $t'$. Hence, by assumption, $t' \in T^1_3$
and therefore $t \in T^1_3$.

It should be noted that the latter construction relies on the assumption that
$M'$ traverses each edge at most once. Otherwise, it could be the case that the
computation $C$ visits $v$ two times but the extension to $C'$ makes use of two
different 0-subtrees rooted at $v$.

Let now $t$ be a critical tree of depth $d$ and let $C = c_0, \ldots, c_m$ be an accepting
computation of $M$ on $t$. If $C$ does not visit all 1-leaves then we can easily
construct a tree $t' \in T^0_3$ which is accepted by $M$. Hence, we assume that $C$
visits each 1-leaf of $t$ exactly once. Let $v$ be such a 1-leaf of $t$ and let $c_j = [v, s]$
be the configuration of $C$ which visits it. Let, for $i \in \{1, \ldots, d\}$, $v_i$ denote the
vertex of depth $i$ on the path from the root of $t$ to $v$. As $v$ is a 1-leaf, each $v_i$
is a 1-vertex. Therefore, as $t$ is critical, each vertex $v_i$ has exactly one sibling $v'_i$
which is a 1-vertex. For each $i$, $v'_i$ is visited in exactly one of the subcomputations
$c_0, \ldots, c_{j-1}$ and $c_{j+1}, \ldots, c_m$. We define, for each 1-leaf $v$ its history string
$z = h_{C,t}(v) = z_1 \cdots z_d$ by setting $z_i = 1$ if and only if $v'_i$ is visited in $c_0, \ldots, c_{j-1}$.
These history strings have a couple of nice properties which are straightforward
to prove given the assumptions on $M$.

— The binary number $b_{C,t}(v)$ represented by $z = h_{C,t}(v)$ (where $z_d$ is the least
significant bit) coincides with the number of 1-leaves that are visited in $C$
before $v$. This follows from the restriction that the automaton can visit each
subtree at most once. Hence, when a 1-vertex $v$ at level $i$ is entered, then
all 1-leaves in the subtree of $v$ have to be visited before the subtree is
left. In this way, a 1 at the $i$-th bit of the history string corresponds to $2^{d-i}$
1-leaves that have been already visited.

— Consequently, the function $b_{C,t}$ defines a numbering of $t$. In particular, each
0-1-string occurs exactly once as a history string of a 1-leaf $v$.
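The two properties can be checked mechanically on the binary skeleton of 1-vertices. The Python sketch below is an illustration under simplifying assumptions: only the full binary tree formed by the 1-vertices is modelled, a 1-leaf is a tuple of bits naming the 1-child taken at each level, and a 1-bounded traversal is abstracted as a hypothetical function `first_child` that says which 1-child of a vertex is visited first.

```python
def visit_order(depth, first_child, prefix=()):
    """1-leaves in visiting order when every subtree is visited in one piece."""
    if len(prefix) == depth:
        return [prefix]
    first = first_child(prefix)
    return (visit_order(depth, first_child, prefix + (first,))
            + visit_order(depth, first_child, prefix + (1 - first,)))

def history(leaf, first_child):
    """History string z_1 ... z_d of a 1-leaf.

    z_i is "1" iff the 1-sibling of the i-th vertex on the path to the leaf
    was visited earlier, i.e. iff that vertex is not the first child visited.
    """
    bits = []
    for i, bit in enumerate(leaf):
        bits.append("1" if bit != first_child(leaf[:i]) else "0")
    return "".join(bits)
```

Reading `history(leaf, first_child)` as a binary number gives the position of the leaf in `visit_order`, so the histories of the $2^d$ 1-leaves are pairwise distinct.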

We claim that there exist a $d$, two critical trees $t, t'$ of depth $d$, two accepting
computations $C, C'$ of $M$ on $t, t'$, respectively, and a leaf $v$ of $\tau_d$ such that (i)
$v$ is a 1-leaf of $t$ and $t'$; (ii) $C$ and $C'$ visit $v$ in the same state $s$; and (iii)
$h_{C,t}(v) \ne h_{C',t'}(v)$.

Towards a contradiction assume that this claim is false. Let $m$ be the number
of states of $M$ and let $d$ be a number as given by Lemma 5. Let $t$ and $t'$ be two
critical trees of depth $d$, let $v$ be a common 1-leaf of $t$ and $t'$ and let $C$ and $C'$ be
accepting computations of $t$ and $t'$, respectively, which visit $v$ in the same state $s$.
The assumption implies that $h_{C,t}(v) = h_{C',t'}(v)$ and therefore $b_{C,t}(v) = b_{C',t'}(v)$.
We can conclude that, for each vertex $v$ of $\tau_d$ and each state $s$ of $M$, there is only
one number $n(v, s)$ such that $b_{C,t}(v) = n(v, s)$ for all critical trees $t$ with 1-leaf
$v$ and all accepting computations which visit $v$ in the state $s$. In other terms,
there exists an $m$-labeling $\mathcal{M}$ of the leaves of $\tau_d$ such that, for each critical $t$ and
each accepting computation $C$ on $t$ the numbering $b_{C,t}$ is compatible with $\mathcal{M}$.
This contradicts Lemma 5, as desired. Therefore, the claim is proved.

Let $d$, $t$, $t'$, $C$, $C'$ and $v$ be as given by the above claim. We complete the
proof by constructing a tree $t_0 \in T^0_3$ which is accepted by $M$. Let $z = h_{C,t}(v)$
and $z' = h_{C',t'}(v)$. Let $j$ be minimal such that $z_j \ne z'_j$. We can assume w.l.o.g.
that $z_j = 0$ and $z'_j = 1$. Let, for each $i \in \{1, \ldots, d\}$, $v_i$ be defined as above, $w_i$
be the 1-sibling of $v_i$ in $t$ and $w'_i$ be the 1-sibling of $v_i$ in $t'$. We construct $t_0$ as
follows: (i) for each $i < j$, if $z_i = 1$ (i.e., $w_i$ is visited before $v$ in $C$) then we copy
the subtrees rooted at the siblings of $v_i$ from $t$; (ii) for each $i < j$, if $z_i = 0$ (i.e.,
$w'_i$ is visited after $v$ in $C'$) then we copy the subtrees rooted at the siblings of
$v_i$ from $t'$; (iii) at the siblings of $v_j$ we root 0-subtrees (this assures that $t_0$ is a
0-tree); and (iv) in the subtree rooted at $v_j$ all leaves are labeled 1.

Let $C = c_0, \ldots, c_m$, $C' = c'_0, \ldots, c'_{m'}$ and let $k$ and $k'$ be such that $c_k = c'_{k'}$
are the configurations in which $v$ is visited.

It is straightforward to check that

— $c_0, \ldots, c_k$ is a valid subcomputation on $t_0$ because all 1-leaves of $t$ that are
visited in $c_0, \ldots, c_k$ are also 1-leaves in $t_0$;

— $c'_{k'}, \ldots, c'_{m'}$ is a valid subcomputation on $t_0$ because all 1-leaves of $t'$ that are
visited in $c'_{k'}, \ldots, c'_{m'}$ are also 1-leaves in $t_0$; hence

— $c_0, \ldots, c_k = c'_{k'}, \ldots, c'_{m'}$ is an accepting computation on $t_0$, the desired
contradiction.

This concludes the proof of statement (a).

To prove (b) we use a slightly different set $U^1_3$ of trees. These trees have an
additional label, +. Inner vertices that are labeled with + have 3 children which
are interpreted as threshold gates. Inner vertices that are not labeled with +
have only 1 child. Hence, 3 paths start at each +-vertex, each leading
either to another +-vertex or to a leaf. Now a +-vertex is evaluated to 1, if at
least 2 of its 3 descendants (+-vertex or leaf) evaluate to 1. Intuitively, $U^1_3$ is
the same as $T^1_3$ but the edges of trees in $T^1_3$ are replaced by paths in $U^1_3$. In fact,
the proof of the fact that no $r$-restricted TWA recognizes $U^1_3$ is almost word for
word the proof given in (a) but in the trees that are used, each edge has to be
replaced by a path of length $r + 1$. The old vertices are labeled with +, the new
ones not.

Abstract. We study the determinization of transducers over infinite
words. We consider transducers with all their states final. We give an 
effective characterization of sequential functions over infinite words. We 
also describe an algorithm to determinize transducers over infinite words. 



1 Introduction 

The aim of this paper is the study of determinization of transducers over infinite 
words, that is of machines realizing rational transductions. A transducer is a 
finite state automaton (or a finite state machine) whose edges are labeled by 
pairs of words taken in finite alphabets. The first component of each pair is called 
the input label. The second one the output label. The rational relation defined 
by a transducer is the set of pairs of words which are labels of an accepting 
path in the transducer. We assume that the relations defined by our transducers 
are functions, which map each string of the domain to a string. This is a decidable
property |^. 

The study of transducers has many applications. Transducers are used to 
model coding schemes (compression schemes, convolutional coding schemes, cod- 
ing schemes for constrained channels, for instance). They are widely used in 
computer arithmetic |2| and in natural language processing H3|. Transducers 
are also used in programs analysis [0|. The determinization of a transducer is 
the construction of another transducer which defines the same function and has 
a deterministic (or right resolving) input automaton. Such transducers allow a 
sequential encoding and thus are called sequential transducers. 

The characterization of sequential functions on finite words was obtained by 
Choffrut | M-lb| . His proof contains implicitly an algorithm for determinization of 
a transducer. This algorithm has also been described by Mohri nn and Roche 
and Shabes [Q p. 223-233]. In this paper, we address the same problem for 
infinite words. We consider transducers and functions over infinite words and
our transducers have all their states final. The reason why we assume that all
states are final is that the case of transducers with final states seems to be much
more complex. Indeed, the determinization of automata over infinite words is
already very difficult M- In particular, it is not true that any rational set of
infinite words is recognized by a deterministic automaton with final states and
Büchi acceptance condition. Other accepting conditions, such as the Muller condition,
must be used.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 561-^7^ 2000.

We first give an effective characterization of sequential functions over infinite 
words. This characterization extends to infinite words the twinning property 
introduced by Choffrut ^j. We prove that a function is sequential if it is a 
continuous map whose domain can be recognized by a deterministic Büchi automaton,
and such that the transducer obtained after removing some special
states has the twinning property. These conditions can be simplified in the case 
where the transducer has no cycling path with an empty output label. We use 
this characterization to describe an algorithm checking whether a function re- 
alized by a transducer is sequential. This algorithm becomes polynomial when 
the transducer has no cycling path with an empty output label. Finally, we give 
an algorithm to determinize a transducer. The algorithm is much more complex 
than in the case of finite words. 

The paper is organized as follows. Section 2 is devoted to basic notions of
transducers and rational functions. We give in Sect. 3 a characterization of
sequential functions while the algorithm for determinization of transducers is
described in Sect. 0

2 Transducers 

In the sequel, $A$ and $B$ denote finite alphabets. The sets of finite and right-infinite
words over $A$ are respectively denoted by $A^*$ and $A^\omega$. The empty word
is denoted by $\varepsilon$. The set $A^\omega$ is endowed with the usual topology induced by
the following metric: the distance $d(x,y)$ is equal to $2^{-n}$ where $n$ is the minimum
$\min\{i \mid x_i \ne y_i\}$. In this paper, a function from $A^\omega$ to $B^\omega$ is said to be continuous
iff it is continuous with respect to this topology.
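For ultimately periodic words this distance can be computed exactly. The Python sketch below is an illustration under assumptions made here: a word $uv^\omega$ is represented as a pair `(u, v)` of strings with `v` non-empty, and positions are numbered from 0 (the text does not fix the indexing convention).

```python
from math import gcd

def letter(word, i):
    """Letter at position i of the ultimately periodic word u v^omega."""
    prefix, period = word
    if i < len(prefix):
        return prefix[i]
    return period[(i - len(prefix)) % len(period)]

def distance(x, y):
    """2**(-n) for the first position n where x and y differ, 0.0 if equal."""
    start = max(len(x[0]), len(y[0]))
    lcm = len(x[1]) * len(y[1]) // gcd(len(x[1]), len(y[1]))
    # Beyond `start` both words are periodic with period `lcm`, so agreement
    # on the first start + lcm positions implies the words are equal.
    for i in range(start + lcm):
        if letter(x, i) != letter(y, i):
            return 2.0 ** -i
    return 0.0
```

Two words are close when they share a long common prefix, so continuity of a function means that long prefixes of the output are determined by sufficiently long prefixes of the input.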

A transducer over $A \times B$ is composed of a finite set $Q$ of states, a set $E \subseteq
Q \times A^* \times B^* \times Q$ of edges and a set $I \subseteq Q$ of initial states. An edge $e = (p, u, v, q)$
from $p$ to $q$ is denoted by $p \xrightarrow{u|v} q$. The words $u$ and $v$ are called the input label
and the output label. Thus, a transducer is the same object as an automaton,
except that the labels of the edges are pairs of words instead of letters. In the
literature, transducers also have a set of final states. In this paper, we only
consider transducers all of whose states are final and with Büchi acceptance
condition. Any infinite path which starts at an initial state is then successful.
We omit the set of final states in the notation.

An infinite path in the transducer A is an infinite sequence

$q_0 \xrightarrow{u_0|v_0} q_1 \xrightarrow{u_1|v_1} q_2 \xrightarrow{u_2|v_2} q_3 \cdots$

of consecutive edges. Its input label is the word $x = u_0u_1u_2 \cdots$ whereas its
output label is the word $y = v_0v_1v_2 \cdots$. The path is said to start at $q_0$.

An infinite path is then successful if it starts at an initial state. A pair $(x, y)$
of infinite words is recognized by the transducer if it labels a successful path.
A transducer defines then a relation $R \subseteq A^\omega \times B^\omega$. The transducer computes
a function if for any word $x \in A^\omega$, there exists at most one word $y \in B^\omega$
such that $(x,y) \in R$. We call it the function realized by the transducer. Thus
a transducer can be seen as a machine computing nondeterministically output
words from input words. We denote by $\mathrm{dom}(f)$ the domain of the function $f$. A
transducer that realizes a function can be transformed in an effective way into a
transducer labelled in $A \times B^*$ that realizes the same function. These transducers
are sometimes called real time transducers.

Let $T$ be a transducer. The underlying input automaton of $T$ is obtained by
omitting the output label of each edge. A transducer $T$ is said to be sequential
if it is labeled in $A \times B^*$ and if the following conditions are satisfied.

— it has a unique initial state,

— the underlying input automaton is deterministic.

These conditions ensure that for each word $x \in A^\omega$, there is at most one word
$y \in B^\omega$ such that $(x, y)$ is recognized by $T$. Thus, the relation computed by $T$ is
a partial function from $A^\omega$ into $B^\omega$. A function is sequential if it can be realized
by a sequential transducer.



[Figure: the sequential transducer of Example 1; recoverable edge labels: 0|0, 1|1]




Example 1. Let $A = \{0,1\}$ be the binary alphabet. Consider the sequential
transducer $T$ pictured in Fig. 1. If the infinite word $x$ is the binary expansion of
a real number $\alpha \in [0, 1)$, the output corresponding to $x$ in $T$ is the binary expansion
of $\alpha/3$. The transducer $T$ realizes the division by 3 on binary expansions.
The transducer obtained by exchanging the input and output labels of each edge
realizes of course the multiplication by 3. However, this new transducer is not
sequential.
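Division by 3 can be carried out digit by digit in the manner of long division, which gives a sequential transducer whose states are the possible remainders. The Python sketch below is a reconstruction under that assumption and is not taken from the figure; the names `EDGES` and `run` are hypothetical, and the function processes a finite prefix of the input word.

```python
# Assumed construction: the state is the current remainder r in {0, 1, 2}.
# Reading the bit b, the transducer outputs (2r + b) // 3 and moves to the
# state (2r + b) % 3. Since 2r + b <= 5, the output is always a single bit.
EDGES = {
    (r, b): ((2 * r + b) // 3, (2 * r + b) % 3)
    for r in range(3)
    for b in (0, 1)
}

def run(bits, state=0):
    """Output produced on a finite prefix of the input, given as a bit string."""
    output = []
    for bit in bits:
        out, state = EDGES[(state, int(bit))]
        output.append(str(out))
    return "".join(output)
```

For instance, `run("1100")` maps a prefix of the expansion of 3/4 to the corresponding prefix `"0100"` of the expansion of 1/4. In this construction the self-loops are labeled 0|0 (at remainder 0) and 1|1 (at remainder 2). The input automaton is deterministic because each state has exactly one edge per input bit.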
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3 Characterization of Sequential Functions 



In this section, we characterize functions realized by transducers with all states
final that can be realized by sequential transducers. This characterization uses
topological properties of the function and a twinning property of the
transducer. It extends the result of Choffrut [?] to infinite words.

The characterization of sequentiality is essentially based on the following
notion introduced by Choffrut [?, p. 133] (see also [?, p. 128]). Two states q
and q' of a transducer are said to be twinned if and only if for any pair of paths

\[ i \xrightarrow{u|u'} q \xrightarrow{v|v'} q, \qquad i' \xrightarrow{u|u''} q' \xrightarrow{v|v''} q', \]

where i and i' are two initial states, the output labels satisfy the following
property. Either $v' = v'' = \varepsilon$ or there exists a finite word w such that either
$u'' = u'w$ and $wv'' = v'w$, or $u' = u''w$ and $wv' = v''w$. The latter case is
equivalent to the following two conditions:

(i) $|v'| = |v''|$,

(ii) $u'v'^{\omega} = u''v''^{\omega}$.

A transducer has the twinning property if any two states are twinned.
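
The condition on the four output labels can be checked mechanically for one given pair of paths. The following Python sketch is only an illustration: the function name `outputs_compatible` and the representation of words as strings are assumptions, and it tests a single pair of paths, not the whole twinning property.

```python
def outputs_compatible(u1, v1, u2, v2):
    """Check the twinning condition for one pair of paths.

    u1, v1 stand for u', v' and u2, v2 stand for u'', v'' (finite words,
    given as Python strings).
    """
    if v1 == "" and v2 == "":
        return True
    # Case u'' = u'w and w v'' = v'w.
    if u2.startswith(u1):
        w = u2[len(u1):]
        if w + v2 == v1 + w:
            return True
    # Case u' = u''w and w v' = v''w.
    if u1.startswith(u2):
        w = u1[len(u2):]
        if w + v1 == v2 + w:
            return True
    return False


# u' = a, v' = ba, u'' = ab, v'' = ab: here w = b and both sides equal bab.
print(outputs_compatible("a", "ba", "ab", "ab"))
```

In this usage example the two infinite outputs a(ba)^ω and ab(ab)^ω are the same word, in line with conditions (i) and (ii).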

Before stating the main result, we define a subset of states which plays a
particular role in the sequel. We say that a state q of a transducer is constant if
all infinite paths starting at this state have the same output label. This unique
output is an ultimately periodic word. Note that any state accessible
from a constant state is also constant. We now state the characterization
of sequential functions.

Proposition 1. Let f be a function realized by a transducer T . Let T' be the 
transducer obtained by removing from T all states which are constant. Then the 
function f is sequential if and only if the following three properties hold: 

— the domain of f can be recognized by a deterministic Büchi automaton,

— the function f is continuous, 

— the transducer T' has the twinning property. 

Since the function f is realized by a transducer, the domain of f is rational.
However, it is not true that every rational set of infinite words is recognized by a
deterministic Büchi automaton. Landweber's theorem states that a set of infinite
words is recognized by a deterministic Büchi automaton if and only if it is
rational and $G_\delta$ [?]. Recall that a set is said to be $G_\delta$ if it is equal to a countable
intersection of open sets for the usual topology of $A^\omega$.

It is worth pointing out that the domain of a function realized by a transducer
may be any rational set, even though all states of the transducer are assumed to
be final. The final states of a Büchi automaton can be encoded in the outputs of
a transducer in the following way. Let A = (Q, E, I, F) be a Büchi automaton.
We construct a transducer T by adding an output to every transition of A. A
transition $p \xrightarrow{a} q$ of A becomes $p \xrightarrow{a|v} q$ in T, where v is empty if p is not final
and is equal to a fixed letter b if p is final. It is clear that the output of a path is
infinite if and only if the path goes infinitely often through a final state. Thus the
domain of the transducer T is the set recognized by A. In particular, the domain
of a transducer need not be recognizable by a deterministic Büchi automaton,
as in the following example. It is however true that the domain is closed if the
transducer has no cycling path with an empty output.
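
The encoding of final states into outputs is a one-line transformation of the transition list. The sketch below assumes a hypothetical representation in which a Büchi automaton is given by triples (p, a, q) and a set of final states; the function name `buchi_to_transducer` and the choice of the letter "b" are illustrative only.

```python
def buchi_to_transducer(transitions, final, letter="b"):
    """Attach an output to every transition (p, a, q) of a Büchi automaton.

    The output is the fixed letter when the source state p is final and the
    empty word otherwise, so a path has an infinite output exactly when it
    visits final states infinitely often.
    """
    return [(p, a, letter if p in final else "", q) for (p, a, q) in transitions]


# A nondeterministic automaton for words with finitely many a:
# state 0 guesses the last a, state 1 (final) then reads only b.
automaton = [(0, "a", 0), (0, "b", 0), (0, "b", 1), (1, "b", 1)]
for edge in buchi_to_transducer(automaton, final={1}):
    print(edge)
```

All states of the resulting transducer can be taken final, since acceptance is now carried by the requirement that the output be infinite.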



Fig. 2. Transducer of Example 2 (edge labels: a|ε, b|b)



Example 2. The domain of the function f realized by the transducer of Fig. 2 is
the set $(a + b)^* b^\omega$ of words having a finite number of a. The function f cannot
be realized by a sequential transducer since its domain is not a $G_\delta$ set.

It must also be pointed out that a function realized by a transducer need not
be continuous, even though all states of the transducer are assumed to be final,
as shown in the following example.

Example 3. The image of an infinite word x by the function f realized by the
transducer of Fig. 3 is $f(x) = a^\omega$ if x has an infinite number of a and $f(x) = a^n b^\omega$
if the number of a in x is n. The function f is not continuous. For instance, the
sequence $x_n = b^n a b^\omega$ converges to $b^\omega$ while $f(x_n) = ab^\omega$ does not converge
to $f(b^\omega) = b^\omega$.

Before describing the algorithm for determinization, we first study a partic- 
ular case. It turns out that the first two conditions of the proposition are due to 
the fact that the transducer T may have cycling paths with an empty output. 
If the transducer T has no cycling path with an empty output, the previous 
proposition can be stated in the following way. 

Proposition 2. Let f be a function realized by a transducer T which has no
cycling path with an empty output. Let T' be the transducer obtained by removing
from T all states which are constant. Then the function f is sequential if and
only if the transducer T' has the twinning property.

Fig. 3. Transducer of Example 3 (edge labels: a|a, b|b)

The previous proposition can be directly deduced from Proposition 1 as follows.
If the transducer T has no cycling path with an empty output, any infinite
path has an infinite output. Thus, an infinite word x belongs to the domain of f
if and only if it is the input label of an infinite path in T. The domain of f
is then a closed set. It is therefore recognized by a deterministic Büchi automaton
all of whose states are final. This automaton can be obtained by the usual subset
construction on the input automaton of T. Furthermore, if the transducer T has
no cycling path with an empty output, the function f is necessarily continuous.

We now study the decidability of the conditions of Propositions 1 and 2. We
have the following results.

Proposition 3. It is decidable if a function f given by a transducer with all 
states final is sequential. Furthermore, if the transducer has no cycling path with 
an empty output, this can be decided in polynomial time. 

A Büchi automaton recognizing the domain of the function can be easily deduced
from the transducer. It is then decidable if this set can be recognized by a
deterministic Büchi automaton [?, Thm. 5.3]. However, this decision problem is
NP-complete.

It is decidable in polynomial time whether a function given by a transducer
with final states is continuous [?]. The twinning property of a transducer is
decidable in polynomial time [?].

4 Determinization of Transducers 

In this section, we describe an algorithm to determinize a transducer which
satisfies the properties of Proposition 1. This algorithm proves that the conditions
of the proposition are sufficient. The algorithm is exponential in the number of
states of the transducer.
Let T = (Q, E, I) be a transducer labelled in $A \times B^*$ that realizes a function f.
Let T' be the transducer obtained by removing from T all states which are
constant. We assume that T' has the twinning property. We denote by C the
set of states which are constant. For a state q of C, we denote by $y_q$ the unique
output of q, which is an ultimately periodic word. We suppose that the domain
of f is recognized by the deterministic Büchi automaton A. This automaton is
used in the constructed transducer to ensure that the output is infinite only
when the input belongs to the domain of the function.

We describe the deterministic transducer D realizing the function f. Roughly
speaking, this transducer is the synchronized product of the automaton A of the
domain and of an automaton obtained by a variant of the subset construction
applied to the transducer. In the usual subset construction, a state of the
deterministic automaton is a subset of states which memorizes all accessible states.
In our variant of the subset construction, a state is a subset of pairs formed of
a state and a word which is either finite or infinite.

A state of D is a pair (p, P) where p is a state of A and P is a set containing
two kinds of pairs. The first kind are pairs (q, z) where q belongs to $Q \setminus C$ and z is
a finite word over B. The second kind are pairs (q, z) where q belongs to C and
z is an ultimately periodic infinite word over B. We now describe the transitions
of D. Let (p, P) be a state of D and let a be a letter. Let R be the set
defined as follows:

\[
\begin{aligned}
R ={} & \{(q', zw) \mid q' \notin C \text{ and } \exists (q, z) \in P,\ q \notin C \text{ and } q \xrightarrow{a|w} q' \in E\} \\
{}\cup{} & \{(q', zwy_{q'}) \mid q' \in C \text{ and } \exists (q, z) \in P,\ q \notin C \text{ and } q \xrightarrow{a|w} q' \in E\} \\
{}\cup{} & \{(q', z) \mid q' \in C \text{ and } \exists (q, z) \in P,\ q \in C \text{ and } q \xrightarrow{a|w} q' \in E\}
\end{aligned}
\]

We now define the transition from the state (p, P) with input label a. If R
is empty, there is no transition from (p, P) with input label a. Otherwise, the
output of this transition is the word v defined as follows. Let $p \xrightarrow{a} p'$ be the
transition in A from p labelled by a. If p' is not a final state of A, we define v as
the empty word. If p' is a final state, we define v as the first letter of the words
z if R only contains pairs (q', z) with $q' \in C$ and if all the infinite words z are
equal. Otherwise, we define v as the longest common prefix of all the finite or
infinite words z for $(q', z) \in R$. The state P' is then defined as follows:

\[ P' = \{(q', z) \mid (q', vz) \in R\} \]

There is then a transition $(p, P) \xrightarrow{a|v} (p', P')$ in D. The initial state of D is
the pair $(i_A, J)$ where $i_A$ is the initial state of A and where
$J = \{(i, \varepsilon) \mid i \in I \text{ and } i \notin C\} \cup \{(i, y_i) \mid i \in I \text{ and } i \in C\}$. If the state p' is not final in A, the
output of the transition from (p, P) to (p', P') is empty and the words z of the
pairs (q, z) in P' may have a nonempty common prefix. We only keep in D the
part accessible from the initial state. The transducer D has a deterministic input
automaton. It turns out that the transducer D has a finite number of states.
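
The core of this construction, the subset construction with delayed outputs, can be sketched in Python for a deliberately simplified setting. The sketch below is not the full algorithm of this section: it assumes that every transition emits its common prefix at once (as if all states of the domain automaton were final), it keeps only finite delayed words and does not treat constant states specially, and it stops with an error when too many states appear. The names `determinize`, `run` and `max_states` are hypothetical.

```python
from collections import deque
from os.path import commonprefix  # character-wise longest common prefix


def determinize(edges, initial, max_states=1000):
    """Subset construction with delayed outputs (simplified sketch).

    edges: list of (q, a, w, q2), with input letter a and output word w.
    Returns (start, trans) with trans[(P, a)] = (v, P2), where P and P2 are
    frozensets of pairs (state, delayed output).
    """
    start = frozenset((i, "") for i in initial)
    letters = sorted({a for (_, a, _, _) in edges})
    trans, seen, queue = {}, {start}, deque([start])
    while queue:
        P = queue.popleft()
        for a in letters:
            # Extend every delayed word by the output of a matching edge.
            R = {(q2, z + w) for (q, z) in P
                 for (src, b, w, q2) in edges if src == q and b == a}
            if not R:
                continue
            v = commonprefix([z for (_, z) in R])   # emitted output
            P2 = frozenset((q2, z[len(v):]) for (q2, z) in R)
            trans[(P, a)] = (v, P2)
            if P2 not in seen:
                if len(seen) >= max_states:
                    raise ValueError("state bound exceeded")
                seen.add(P2)
                queue.append(P2)
    return start, trans


def run(start, trans, word):
    """Output of the deterministic transducer on a finite input word."""
    P, out = start, []
    for a in word:
        v, P = trans[(P, a)]
        out.append(v)
    return "".join(out)


edges = [(0, "a", "x", 1), (0, "a", "xy", 2),
         (1, "b", "yb", 3), (2, "c", "c", 3), (3, "d", "d", 3)]
start, trans = determinize(edges, [0])
print(run(start, trans, "abdd"), run(start, trans, "acd"))
```

In the usage example the letter a leads nondeterministically to states 1 and 2 with outputs x and xy. The deterministic machine emits the common prefix x and remembers the delay y for state 2, which it emits on the next letter.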
The following proposition finally states that the sequential transducer D is
finite and that it is equivalent to the transducer T. Both transducers realize the
same function over infinite words.

Proposition 4. The sequential transducer D has a finite number of states and
it realizes the same function f as the transducer T.

It is not straightforward that the transducer D actually has a finite number of
states. It must be proved that the finite words which occur as second component
of the pairs in the states have bounded length. It then follows that the infinite words
occurring as second component of the pairs are suffixes of a finite number of
ultimately periodic words. Therefore, there are finitely many such words.

It must also be proved that the transducer D realizes the same function as T.
This follows mainly from the following lemma which states the key property of
the edges in D.

Lemma 1. Let u be a finite word. Let $(i_A, J) \xrightarrow{u|v} (p, P)$ be the unique path
in D with input label u from the initial state. Then, the state p is the unique
state of A such that $i_A \xrightarrow{u} p$ is a path in A and the set P is equal to

\[ P = \{(q, z) \mid \exists\, i \xrightarrow{u|v'} q \text{ in } T \text{ such that } v' = vz \text{ if } q \notin C \text{ and } v'y_q = vz \text{ if } q \in C\}. \]

This construction is illustrated by the following example. 



Fig. 4. Transducer of Example 4 (surviving edge labels: a|a, a|ε)



Example 4. Consider the transducer pictured in Fig. 4. A deterministic Büchi
automaton recognizing the domain is pictured in Fig. 5. If the algorithm for
determinization is applied to this transducer, one gets the transducer pictured
in Fig. 6.

Fig. 5. A deterministic Büchi automaton for the domain (surviving edge labels: b, a, c, a)

Fig. 6. Determinization of the transducer of Fig. 4

These determinizations do not preserve dynamic properties of transducers
such as the locality of the output automaton. Recall that a finite automaton is
local if any two bi-infinite paths with the same label are equal. We mention that
in [?], an algorithm is given to determinize transducers over bi-infinite words
that have a right closing input (or that are n-deterministic or deterministic with
a finite delay in the input) and a local output (see also [?, p. 143] and [?,
pp. 110-115]). This algorithm preserves the locality of the output. These features
are important for coding applications.
In the spring of ’99 my colleague Gert Smolka gave me a short introduction
to constraint programming. During our discussion Gert emphasized that the
search for efficient propagation algorithms leads to hard and well-motivated
questions in algorithmics. He pointed me to the papers [Reg94] by J.-C. Régin on
“A filtering algorithm for constraints of difference in CSPs” and [?]
by N. B. Guernalec and A. Colmerauer on “Narrowing a 2n-block of sortings
in O(n log n)”. I soon learned that Gert had pointed me to an extremely rich
source of algorithmic problems which I am now exploring in cooperation with E.
Althaus and S. Thiel from my research group, D. Duchier and J. Niehren from the
programming systems lab, A. Koller from the computer linguistics department
at the Universität des Saarlandes, and with Nicolas Beldiceanu at SICS. Some
papers [MT00, ADK+00, KMN00] and implementations of propagators for the Oz
system (http://www.ps.uni-sb.de/oz2/) have come out of the cooperation so
far.

The purpose of this talk is to get more researchers from the algorithms
community interested in the subject. I want to make the case that constraint
programming offers many very challenging algorithmic problems and that cooperation
between the constraint programming and the algorithms communities could be very
beneficial to both.



1 Constraint Programming 

I give a brief account of constraint programming based on [Smo96]. For more
thorough discussions we refer the reader to [Smo96, MS98]. Examples of
constraint programming systems are ILOG, CHIP, Eclipse, and Oz.

Constraint programming is a powerful programming paradigm that allows
one to formulate computational problems at a very high level. A computational
problem is formulated as a constraint. A constraint is simply a conjunction of
formulae of first-order logic. A solution is an assignment of values to the (free)
variables which satisfies the constraint. It is the task of the constraint
programming system to find a (all, a best) solution. The programmer only needs to
specify the problem.^1 Readers entirely unfamiliar with constraint programming
should move on to the next section and have a look at the specification of the
n-queens problem given there.

^1 In most constraint programming systems, the programmer can also specify the search
strategy, but this is not important at the current level of discussion.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 571- K7^ 2000.

Two types of constraints are distinguished: basic and non-basic constraints. 
This distinction is more a pragmatic distinction than a mathematical distinction. 
Basic constraints are constraints that can be handled efficiently and non-basic 
constraints are difficult constraints. 

The basic constraints must have two properties: 

— There is an efficient procedure which, given a collection of basic constraints, 
decides their satisfiability, or exhibits a solution, or exhibits all solutions. 

— Fixing the value of a variable turns a set of basic constraints into a set of 
basic constraints. 

We come to non-basic constraints. Non-basic constraints are difficult. They 
are simplified by means of iterated branching and propagation steps. A branch- 
ing step splits the current state into two: the first is obtained by adding some 
constraint and the second is obtained by adding its negation. Branching gener- 
ates a tree of states. Propagation operates on the current leaves of this tree and 
tries to simplify the states associated with them. 

A propagator is associated with every non-basic constraint C. Let B denote 
the conjunction of all basic constraints. The invocation of a propagator has one 
of the following effects. 

— The propagator may declare C obsolete. If C is declared obsolete, B must 
entail C, i.e., any assignment satisfying B must also satisfy C. 

— The propagator may declare inconsistency. In this case $B \wedge C$ must be
unsatisfiable.

— The propagator advances to D. Here D is a basic constraint that is entailed
by $B \wedge C$ and that is strictly stronger than B, i.e., D is not entailed by B.

— The propagator cannot make any progress.^2

The computation in a leaf can stop when there is no further propagator or 
when a propagator declares inconsistency. In the former situation B defines the 
set of solutions. In the latter case, the leaf contributes no solution. 

When the computation has not stopped yet and no propagator can make any
progress, a distribution step (branching step) is taken. We invent a constraint
C and proceed from the unsolved space to two new spaces, the first obtained by
adding a propagator for C and the second obtained by adding a propagator for
the negation $\neg C$.

2 An Example: The Alldiff Constraint 

We use the alldiff constraint to illustrate the connection between propagation 
and graph algorithms. 

^2 Propagators must have some minimal capabilities. In particular, propagators must
make progress if B determines a single assignment. In this case a propagator for C
must either declare C obsolete or must declare inconsistency.

Consider the following situation. We have a set of variables $x_1, \ldots, x_n$.
The basic constraints define for each variable $x_i$ a finite set $V_i$ of conceivable
values. We also want the values of the variables to be pairwise distinct (alldiff
constraint). The well-known n-queens problem is a toy example which shows
the usefulness of the alldiff constraint.

The goal is to place n queens on an n x n checkerboard. There can be at most
one queen in each row, column, diagonal, and anti-diagonal. Assume that we
place the i-th queen in position $(i, x_i)$ (this definition makes sense since we have
exactly one queen per row), $1 \le i \le n$. Every queen excludes a diagonal (45°
line) and an anti-diagonal (-45° line). If we identify a diagonal and anti-diagonal
with the square which it uses in row 0, then a queen in position $(i, x_i)$ uses
the diagonal $y_i = i + x_i$ and the anti-diagonal $z_i = i - x_i$. Thus we have the
constraints:

\[ x_i \in [1\,..\,n], \quad y_i \in [2\,..\,2n], \quad z_i \in [-n+1\,..\,n-1], \]

\[ y_i = x_i + i, \quad z_i = i - x_i, \]

\[ \mathrm{Alldifferent}(x_1, \ldots, x_n), \quad \mathrm{Alldifferent}(y_1, \ldots, y_n), \quad \mathrm{Alldifferent}(z_1, \ldots, z_n). \]

The first line defines the conceivable values for our variables, the second line
defines relations between pairs of variables, and the third line defines three
alldifferent constraints. Propagators for the equalities in the second line are easily
designed. If a value v becomes impossible for $x_i$, the value v + i becomes
impossible for $y_i$, and vice versa.
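
The specification can be read directly as a test on candidate placements. The Python sketch below checks the three alldifferent constraints by brute force over all permutations. It is only meant to illustrate the model, not to show how a constraint programming system searches; the function names `all_different` and `queens` are hypothetical.

```python
from itertools import permutations


def all_different(values):
    """True if no value occurs twice."""
    values = list(values)
    return len(set(values)) == len(values)


def queens(n):
    """Enumerate all solutions of the n-queens specification by brute force."""
    solutions = []
    rows = range(1, n + 1)
    for perm in permutations(rows):      # Alldifferent(x_1..x_n) by construction
        x = dict(zip(rows, perm))
        y = [i + x[i] for i in rows]     # diagonals
        z = [i - x[i] for i in rows]     # anti-diagonals
        if all_different(y) and all_different(z):
            solutions.append(perm)
    return solutions


print(len(queens(6)), len(queens(8)))
```

The enumeration visits n! candidates, which is exactly the kind of blind search that propagation is meant to avoid.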

Régin [Reg94] observed that matching theory leads to efficient propagators
for the alldiff constraint. He suggested viewing an alldiff constraint as a matching
problem in a bipartite graph. On the left side of the bipartite graph there is a
node for each variable and on the right side of the graph there is a node for each
conceivable value. A variable x is connected to a value v if v is a possible value for x.

An alldiff constraint is satisfiable iff the graph G just defined has a matching
in which all variables are matched (a variable-perfect matching). This is a
well-studied problem in algorithmics. There is a solution to the alldiff constraint in
which variable x has value v iff there is a variable-perfect matching containing the
edge (x, v). This is also a question which has been discussed in the literature on
matching. Puget gave the following answer.

Let M be any matching involving all variables. We orient G as follows: Direct
all edges in M from right to left and all edges not in M from left to right. An
edge $(x, v) \notin M$ is part of a perfect matching if there is an alternating cycle
containing it or an alternating path starting in a free node on the right side and
ending in a matched node on the right-hand side. In the directed version of G, the
first kind of edge is an edge in any strongly connected component and the second
kind of edge is any edge that is reachable from a free node on the right-hand side.
Thus given a matching, narrowing takes linear time. If no matching is available,
narrowing takes time $O(\sqrt{n}\,m)$. Isn't this a nice application of matching theory?

In the two preceding sentences we talked about narrowing instead of
propagation because propagation for the alldiff constraint amounts to narrowing the
sets of potential values of the variables.
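
The narrowing step can be sketched in Python as follows. This is a simplified illustration and not the linear-time procedure described above: the matching is found by plain augmenting paths, and instead of computing strongly connected components once, the sketch runs one graph search per value. With the orientation of the text, a free value has no outgoing edge, so the second kind of edge is detected here by asking whether a free value can be reached from the value endpoint of the edge. The function name `alldiff_narrow` and the dictionary representation of domains are assumptions.

```python
def alldiff_narrow(domains):
    """Narrow the domains of an alldiff constraint (simplified sketch).

    domains: dict mapping each variable to its set of conceivable values.
    Returns the narrowed dict, or None if the constraint is unsatisfiable.
    """
    var_of = {}                       # matched value -> variable

    def augment(x, seen):
        for v in domains[x]:
            if v not in seen:
                seen.add(v)
                if v not in var_of or augment(var_of[v], seen):
                    var_of[v] = x
                    return True
        return False

    for x in domains:
        if not augment(x, set()):
            return None               # no variable-perfect matching
    val_of = {x: v for v, x in var_of.items()}
    free = set().union(*domains.values()) - set(var_of)

    def reachable(v):
        """Search from value v: matched edges go value -> variable,
        unmatched edges go variable -> value."""
        vals, variables, stack = {v}, set(), [v]
        while stack:
            x = var_of.get(stack.pop())
            if x is None or x in variables:
                continue
            variables.add(x)
            for w in domains[x]:
                if w != val_of[x] and w not in vals:
                    vals.add(w)
                    stack.append(w)
        return vals, variables

    cache, narrowed = {}, {}
    for x, dom in domains.items():
        keep = set()
        for v in dom:
            if val_of[x] == v:
                keep.add(v)           # matched edges always survive
                continue
            if v not in cache:
                cache[v] = reachable(v)
            vals, variables = cache[v]
            # x reachable from v: alternating cycle; free value reachable:
            # alternating path ending in a free node.
            if x in variables or vals & free:
                keep.add(v)
        narrowed[x] = keep
    return narrowed


print(alldiff_narrow({"x1": {1, 2}, "x2": {1, 2}, "x3": {1, 2, 3}}))
```

In the usage example the variables x1 and x2 occupy the values 1 and 2 between them, so the domain of x3 is narrowed to the single value 3.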
We can go further. Suppose we have just narrowed an alldiff constraint and
then perform branching, e.g., we may partition the set of values of some variable
into two. This will define two new bipartite graphs. Of course, we would like
to work on these graphs without explicitly constructing them. That's a typical
problem in incremental graph algorithms.

Frequently, the conceivable values of a variable form an interval of natural
numbers. The bipartite graph can then be represented in space O(n) and we may
be interested in narrowing algorithms which reduce the sizes of these intervals.
The question arises whether this can be done without constructing the
bipartite graph defined above. Puget [Pug98] answered this question positively and
described an O(n log n) algorithm for obtaining bound consistency. The
algorithm is again based on matching theory. In graph-theoretic terms the question
amounts to computing perfect matchings and strongly connected components of
so-called convex bipartite graphs. In a convex bipartite graph, the nodes on the
right side are linearly ordered and the neighbors of each node on the left side form
an interval in the right side. The matching problem in bipartite convex graphs was
first studied in [?]. In [MT00], S. Thiel and I exploit the connection and
obtain simplified and faster narrowing algorithms for the alldiff and the sortedness
constraint.
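
For interval domains, a variable-perfect matching can be found without building the bipartite graph. The Python sketch below uses a standard greedy rule (scan the values in increasing order and give each value to the pending variable whose interval ends first). It is offered as an illustration of why convexity helps and is not claimed to be the algorithm of any of the cited papers. The function name `convex_matching` and the list-of-pairs input format are assumptions.

```python
import heapq


def convex_matching(intervals):
    """Assign distinct integer values to variables with interval domains.

    intervals: list of (lo, hi) pairs, inclusive bounds, one per variable.
    Returns a list with the value of each variable, or None if no assignment
    with pairwise distinct values exists.
    """
    n = len(intervals)
    order = sorted(range(n), key=lambda i: intervals[i][0])
    assigned = [None] * n
    heap = []                 # (upper bound, variable) for started intervals
    k = 0                     # next variable in order of lower bound
    value = None
    while k < n or heap:
        if not heap:
            # Nothing pending: jump to the next interval start.
            lo = intervals[order[k]][0]
            value = lo if value is None else max(value, lo)
        while k < n and intervals[order[k]][0] <= value:
            i = order[k]
            heapq.heappush(heap, (intervals[i][1], i))
            k += 1
        hi, i = heapq.heappop(heap)
        if hi < value:
            return None       # the most urgent variable has missed its interval
        assigned[i] = value
        value += 1
    return assigned


print(convex_matching([(1, 2), (1, 2), (1, 3)]))
```

Sorting and the heap operations give a running time of O(n log n) for this sketch, and the intervals themselves are the only representation of the graph that is ever stored.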



3 Further Reading 

In the previous section we tried to make the point that propagation raises
interesting algorithmic questions. How can you learn more about these problems?

— The most effective method is probably to talk to a researcher in the con- 
straint programming community. 

— Read [VHS+97] on strategic directions in constraint programming.

— Check the proceedings of the conference on “Principles and Practice of
Constraint Programming (CP)” [MP98, Jaf99].

— Check the journal “Constraints” (Kluwer Academic Publishers).

— Read the paper [?] by Nicolas Beldiceanu. The paper classifies
constraints using graph terminology.
Abstract. In this paper, we provide a method to safely store a
document in perhaps the most challenging setting: a highly decentralized
replicated storage system where up to half of the storage servers may
incur arbitrary failures, including alterations to data stored in them.
Using an error correcting code (ECC), e.g., a Reed-Solomon code, one
can take n pieces of a document and replace each piece with another piece
of size larger by a factor of $n/(n - 2t)$ such that it is possible to recover the
original set even when up to t of the larger pieces are altered. For t close
to n/2 the space overhead of this scheme is close to n, and an ECC such
as the Reed-Solomon code degenerates to a trivial replication code.

We show a technique to reduce this large space overhead for high values
of t. Our scheme blows up each piece by a factor slightly larger than two
using an erasure code which makes it possible to recover the original set
using $n/2 - O(n/d)$ of the pieces, where $d \approx 80$ is a fixed constant. Then
we attach to each piece $O(d \log n/\log d)$ additional bits to make it
possible to identify a large enough set of unmodified pieces, with negligible
error probability, assuming that at least half the pieces are unmodified,
and with low complexity. For values of t close to n/2 we achieve a large
asymptotic space reduction over the best possible space blowup of any
ECC in a deterministic setting. Our approach makes use of a d-regular
expander graph to compute the bits required for the identification of
$n/2 - O(n/d)$ good pieces.



1 Introduction 

In order to safeguard a document, the most simple solution is to replicate it, and 
to store the different copies in different places. This method, however, has two 
main drawbacks. First, the integrity of multiple replicas is harder to maintain, 
and second the required storage space grows linearly with the number of copies. 
In this paper, we provide a method to safely store a document that addresses 
both issues. First, our method guarantees integrity against arbitrary alterations, 
even malicious ones, in up to half of the storage servers. Second, the storage costs 
remain reasonable even in large systems, composed of hundreds or thousands of 
servers. 
1.1 Our Contribution

Our approach makes use of an erasure code (that can recover the information
provided some pieces are lost but the ones that remain are required to be correct)
and adds verification information to the code pieces. For our computations, we
assume usage of IDA [Rab89], whose space blow-up is optimal, though other
erasure codes, e.g., [?], may be employed for efficiency with a slight
sacrifice in space overhead. Let n be the number of pieces. We arrange the pieces
in a graph, called the storage graph, such that each piece is a vertex, and an edge
exists between vertices when they cross verify each other. Each vertex stores
fingerprint information that transitively verifies every vertex up to distance k
away in the graph, where k is a parameter chosen at setup time. A fingerprint is
a digest of fixed length representing the content of a piece. Fingerprints have the
property that it is highly unlikely (and infeasible) to find two different pieces with
the same fingerprint. Typically, a cryptographically secure hash function, e.g.,
SHA1 [SHA1], is used as a digest function. The transitive verification information
takes only a factor k more space than a regular fingerprint of the adjacent pieces.
Herein lies a large gain, since each vertex verifies a neighborhood in the graph of
radius k, which grows exponentially. The total storage cost is O(kdn), where d
is a bound on the storage graph degree. When $kd \ll n$, this cost is a significant
improvement over previous methods. The complexity of our recovery and storage
algorithms is O(kdn) in addition to the time required for decoding and encoding
the erasure code of choice. Our algorithm needs to compute, in the worst case,
only kdn digests. The range of parameters which will be of particular interest
for us is when d is constant and k is O(log n).

The storage graph we employ has the property that even when up to t < n/2
of its vertices are removed, a sufficiently large component, of size linear in n,
remains connected with diameter < k. For this, we make use of known constructions of
expander graphs [LPS86] and prove that the required properties hold in them.
That is, we prove that if up to t < n/2 vertices are removed from an expander
like that in [LPS86], then there remains a component of size $n/2 - O(n/d)$ with
diameter $O(\log n/\log d)$. This result is of independent interest and may have
other applications. Furthermore, it can be extended to a setup in which more than
half the vertices are removed.

The retrieval algorithm selects a vertex at random and collects all the
vertices that are verified by it, by a simple breadth-first search. We show that this
selection procedure needs to be repeated only an expected constant number
of times until it collects a linear set of correct vertices. The total number of
fingerprinting computations is bounded by $O(dn \log n/\log d)$. The computation
of fingerprints dominates the time overhead of our retrieval algorithm over the
decoding complexity of the erasure code we use.



1.2 Related Work and Alternative Approaches 

The most prevalent approach for achieving resilience to arbitrary server
corruption is the state machine approach [Lam79, ?], which numerous systems
employ. Using this approach in the context of secure file storage, every server
stores a full replica of the file and processes every update on it. The alteration
of data stored by servers can be masked by obtaining t + 1 identical replicas,
where t is presumed to be a bound on the total number of corrupted replicas.
Unfortunately, this method has a high overhead in storing full copies of the file
at each replica.

When alterations to stored data are not of concern, erasure codes solve the
problem. For example, Rabin [Rab89] presents a solution, called Information
Dispersal Algorithm (IDA), which transforms a document of size s into
n pieces of size s/m, where m is a parameter chosen by the user, such that
the document can be reconstructed from any m pieces. Since the total amount
of space taken by m pieces is exactly s, the space overhead of IDA is clearly
optimal. However, if any of the obtained pieces is altered, the integrity of the
reconstructed document may be compromised. Moreover, a user obtaining such
an erroneous document has no way of detecting that an error has occurred, and
may simply return erroneous results undetectably.

To overcome this problem, it is necessary to add redundant information to
pieces when they are stored, indicating when some other piece(s) are altered.
A simple approach is to store a fingerprint of the entire document with each 
piece. To recover the file, first one gets the correct fingerprint from a majority 
of the pieces, and then checks combinations of pieces for a file with the same 
fingerprint. However, this may lead to prohibitive computations in searching for 
a right combination of unaltered pieces. 

To obtain a feasible solution one could use an error correcting code (ECC).
An ECC takes n pieces and blows up each piece with some additional
information such that it is possible to recover all the pieces provided that n - t are
uncorrupted, for any t < n/2. The minimal space-blowup factor of any ECC is
$n/(n - 2t)$ where n is the number of pieces. There are well known ECCs that
achieve this optimal space overhead, e.g., Reed-Solomon codes. Unfortunately,
when t approaches n/2, the space blows up by a factor of n (pieces) and this
degenerates to simple replication.
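
A small numeric sketch makes the degeneration visible. The Python function below simply evaluates the blowup factor n/(n - 2t) quoted above; the name `ecc_blowup` is hypothetical.

```python
from fractions import Fraction


def ecc_blowup(n, t):
    """Minimal space blowup n/(n - 2t) of an ECC tolerating t altered pieces."""
    if not 0 <= 2 * t < n:
        raise ValueError("need 0 <= t < n/2")
    return Fraction(n, n - 2 * t)


for t in (0, 250, 400, 499):
    print(t, ecc_blowup(1000, t))
```

With n = 1000 the factor is 2 for t = 250, and it is already 500 for t = 499. For odd n and t = (n - 1)/2 the factor is exactly n.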

Instead of using an ECC on the pieces themselves one can apply it to a
shorter sequence of digests of the pieces, thereby reducing the space overhead at
the expense of getting only a probabilistic guarantee for recovery. For example,
the Secure IDA method in [?] computes a fingerprint for each piece, and
stores the vector of fingerprints using an ECC. To recover the document, first
the vector of fingerprints is recovered, and then each piece is checked against its
fingerprint. The space blow-up factor for storing a document with this method is
$n/(n - t)$ for the IDA pieces, and an additional space for pieces of the fingerprints
vector, blown up by a factor of $n/(n - 2t)$. Here, too, when t approaches n/2,
the fingerprints vector is fully replicated. The space for the fingerprints vector
depends only on n and the digest function used, and does not depend on the
document length. Nevertheless, this space could be quite prohibitive when n is
large. To illustrate this, suppose the file size is 1 megabyte, fingerprints are 160
bits, n = 1000 and t = 499. Then Secure IDA stores roughly 1000 x 160 extra
bits, or about 20K bytes, with every IDA piece of 1MB/(1000 - 499), about 2K bytes.
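
The arithmetic of this illustration can be replayed in a few lines of Python. The sketch assumes that 1 megabyte means 10^6 bytes and that the fingerprints vector is fully replicated with every piece, as in the case t close to n/2 described above; the function name `secure_ida_sizes` is hypothetical.

```python
def secure_ida_sizes(file_bytes, n, t, fingerprint_bits):
    """Return (extra bytes per piece, IDA piece bytes) for the example above.

    Assumes the whole fingerprints vector (n fingerprints) is stored with
    every piece, which is the situation when t is close to n/2.
    """
    extra_bytes = n * fingerprint_bits // 8   # n fingerprints, 8 bits per byte
    piece_bytes = file_bytes // (n - t)       # IDA piece of size s/(n - t)
    return extra_bytes, piece_bytes


print(secure_ida_sizes(10**6, 1000, 499, 160))
```

The verification data is thus about ten times larger than the piece it accompanies, which is the overhead the remainder of the paper aims to reduce.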

We can reduce the large blowup factor of ECC (either on the pieces
themselves or on the fingerprints) by using a list decoding algorithm. The general idea
is to use an ECC which is able to correct fewer errors than the maximum possible.
In case the number of errors is larger than what the ECC is capable of fixing, the
list decoding algorithm will generate a small list of possible decodings from which
we will be able to choose the right one with high probability. Polynomial list
decoding algorithms for Reed-Solomon codes have been recently discovered by
Sudan [Su97]. More specifically, a Reed-Solomon code encodes K blocks into N
blocks, such that any two codewords differ in at least N - K + 1 blocks (the
distance). If at most (N - K)/2 blocks are altered, there is a single codeword
that is closest to the altered data (i.e., differs from it in at most (N - K)/2
blocks). This is the highest error for which Reed-Solomon is guaranteed
to retrieve the original document. As already mentioned, for this error rate to
reach half the blocks, Reed-Solomon blows up a stored document by a factor
close to its size, thus trivializing to full replication.

In case the number of errors is larger than half the code distance, a Reed-Solomon
code can be used to recover a list of all possible decodings. The number
of possible decodings is constant as long as the number of errors is less
than N - √(NK) (see [GRS95]), and the problem of finding the list is known
as the list-decoding problem. Using techniques introduced by Sudan [Su97] and
subsequently improved, such a list is produced with a
randomized polynomial time algorithm. In its most efficient form, the
scheme corrects up to ⌈N - √(NK) - 1⌉ errors. The scheme can be used to address
our problem as follows: A document of length K blocks is encoded into
n = N = 4K blocks using a Reed-Solomon code. In addition, we store with each
block a digest H of the full document. To retrieve the document when up to
half of the blocks may be altered, we recover H from the majority, and employ
Guruswami and Sudan's list-decoding method to obtain a list of possible decodings,
which we compare against H to retrieve the original document with high
probability. The space blow-up of this method is constant (= 4). The drawback
of this scheme is the complexity of the retrieval algorithm, which employs rather
complicated methods, such as polynomial factorization, and has complexity cubic
in n.^2 By comparison, our retrieval method is simpler (using only hashing
and comparisons), and runs in O(n log n) time. We use a completely different
approach whose building blocks may have other applications.
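
To see why the choice N = 4K reaches "half of the blocks", one can compare the unique-decoding radius (N - K)/2 with the list-decoding radius ⌈N - √(NK) - 1⌉ numerically. The sketch below does only that arithmetic; the function names are ours, and no actual decoding is performed.

```python
import math


def unique_decoding_radius(N: int, K: int) -> int:
    """Errors correctable with a unique closest codeword: floor((N - K) / 2)."""
    return (N - K) // 2


def list_decoding_radius(N: int, K: int) -> int:
    """Errors handled by list decoding: ceil(N - sqrt(N*K) - 1)."""
    return math.ceil(N - math.sqrt(N * K) - 1)


if __name__ == "__main__":
    K = 250
    N = 4 * K
    print("unique:", unique_decoding_radius(N, K))
    print("list:  ", list_decoding_radius(N, K))
```

For N = 4K the list-decoding radius is 2K - 1 = N/2 - 1, just under half of the blocks, while unique decoding stops at 3K/2.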

A comparison of the efficiency of our method when half the system may be 
faulty with the various known approaches is given in Table 1 below.



^1 In previous paragraphs we thought of the document as decomposed into n pieces
where n is fixed and the encoding increases the piece size. Here we think of the piece
size as fixed and the encoding translates K pieces to N.

^2 The method by Roth and Ruckenstein [RR00] has computation complexity
O(n^2 log^2 n) but needs twice as much space.
Table 1. Comparison of methods: Storing document of size s on n servers when up to
t = n/2 may be faulty.

Method               Space overhead per server   Store time†   Retrieve time†
-------------------  --------------------------  ------------  --------------
Our method           (2 + ε)s/n + O(log n)       O(n log n)    O(n log n)
Simple replication   s                           none          none
Reed-Solomon         s                           none          none
Secure IDA           2s/n + O(n)                 O(n)          O(n)
List decoding        4s/n                        none          O(n^3)

† In addition to underlying coding/decoding time of the corresponding erasure or
error correcting code.



Going back to our basic motivation, the need for scalable and survivable stor- 
age is reinforced in numerous recent systems that support information sharing 
in highly decentralized settings. Examples are the Eternity service, a
survivable digital document repository; SFS [MK98], a secure file system for a
wide area network; Fleet [MR99], a survivable and scalable data replication
system; a Byzantine file system of Castro et al. [CL99]; and IBM's Evault [GGJ97],
a storage system that employs Rabin's IDA to achieve survivable storage with
reasonable storage burden. The verification information stored in these systems 
to guard against possible alteration of pieces does not scale to large system sizes. 
Our methods are most suitable for all the systems mentioned above and others, 
where scaling is a necessity. 

The methods presented in this paper are concerned with the integrity of 
file storage and retrieval. Other aspects of data security are orthogonal to ours. 
Specifically, methods for preserving the secrecy of file contents in replicated 
systems have been proposed, e.g., in [HT88, AE90], such that the collusion of up
to t faulty servers cannot reveal the contents of the information stored. These 
methods use secret sharing techniques that can be combined with our approach 
to achieve secrecy. 



2 Preliminaries 

The goal of this work is to provide two functions, Share and Reconstruct. Function
Share takes a document x and produces n pieces denoted by Share(x, 1), ...,
Share(x, n). Function Reconstruct recovers the document with high probability
despite arbitrary alterations in up to a threshold t = n/2 of the pieces.

Our algorithms make use of a cryptographically secure hash function H (such
as SHA1 [SHA1]). For any value v, in an unlimited range, H(v) has fixed size (in
bits). We assume that it is computationally infeasible to find two different values
v and v' such that H(v) = H(v'). Typically, setting |H| to 160 bits suffices to
guarantee this today, e.g., with SHA1, and hence we will assume this.

We also use Rabin's IDA [Rab89]. At a high level, IDA takes a data value
x and converts it into n pieces, IDA(x, 1), ..., IDA(x, n), such that recovery of
x is possible from any combination of m pieces, and such that the total space
taken by every m components is precisely |x| (optimal).

3 Share 

The Share transformation takes a document x and produces n pieces Share(x, 1),
..., Share(x, n) to be stored on n corresponding servers. The goal of the Share
transformation is to allow retrieval of the original document x despite arbitrary
corruption of up to t of the pieces. To cope with such alterations, we will transform
x into n pieces and store verification information on each piece in such a
way that discrimination between correct and incorrect pieces can be achieved at 
a low cost, while maintaining a low storage overhead. The challenge is to min- 
imize the storage requirements to enable our scheme to scale up to very large 
systems. The secrecy of the document is not the main concern, and can be added 
using standard methods. 

Our solution first transforms x using IDA into n pieces, IDA(x, 1), ...,
IDA(x, n), such that x can be restored from any subset of (n/2 - εn) pieces
(ε will be determined in Section 3.1). To safeguard against alteration of pieces,
we add to each piece verification information as follows. Pieces are arranged in
a store graph ST = (V, E) on n vertices (which will also be specified in Section
3.1). We denote the set of vertices adjacent to a vertex i in ST by N(i). Each
vertex in ST represents one piece, and it stores k levels of verification information.
For every vertex i, we define level-ℓ verification information V_i^ℓ recursively
as follows: V_i^0 = IDA(x, i), and for ℓ ≥ 1,

$$V_i^{\ell} = \langle H(V_{j_1}^{\ell-1}), \ldots, H(V_{j_{|N(i)|}}^{\ell-1}) \rangle ,$$

where j_1 < ... < j_{|N(i)|} are the neighbors of i.

In other words the level j verification information stored with piece i is the 
tuple of hashes of the level j — 1 verification information stored at its neighbors. 
Each piece stores k levels of verification information that intuitively verify the 
pieces up to distance k away from i in the graph. (The parameter k will be 
determined in the next section). In addition, it stores the hash of the whole file. 
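
The recursive definition of the verification levels can be sketched in a few lines. This is an illustration only: the IDA pieces are replaced by arbitrary byte strings, the hash is SHA-1 from Python's `hashlib`, and the names `H` and `verification_levels` as well as the dictionary-based graph representation are our assumptions.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Fixed-size (160-bit) digest of arbitrary data."""
    return hashlib.sha1(data).digest()


def verification_levels(pieces, neighbors, k):
    """Return levels[l][i]: level-l information of vertex i, for l = 0..k.

    Level 0 is the (stand-in) IDA piece itself; level l >= 1 is the tuple of
    hashes of the level-(l-1) information of i's neighbours, in sorted order,
    stored here as one concatenated byte string.
    """
    levels = [dict(pieces)]
    for _ in range(k):
        prev = levels[-1]
        levels.append({
            i: b"".join(H(prev[j]) for j in sorted(neighbors[i]))
            for i in neighbors
        })
    return levels


if __name__ == "__main__":
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}  # toy store graph
    pieces = {i: f"piece-{i}".encode() for i in ring}
    for level, info in enumerate(verification_levels(pieces, ring, k=2)):
        print(level, {i: len(v) for i, v in info.items()})
```

Altering one piece changes the level-1 information expected at its neighbours and the level-2 information expected two hops away, which is what lets a correct vertex vouch for vertices up to distance k.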

The total space taken by each piece of a document x stored with our method 
is at most 

$$|H|(dk + 1) + \frac{|x|}{n/2 - \epsilon n} ,$$

where d is the maximum degree of any node in ST. When dk = o(n), we get 
a significant improvement over ECCs. Note also that since the space overhead 
is proportional to the product of d and k, we can trade increased degree with 
decreased diameter, and vice versa. Hence, to be useful for storage and retrieval, 
we need the storage graph ST to have the following features: 

- Low degree: The degree of vertex i in ST determines the storage overhead
of the i-th piece.

- Good expansion: The expansion of ST determines the number of vertices
that are at distance k from any particular vertex or group of vertices, and
hence, the number of vertices that can be verified by them.
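
To get a feel for the trade-off between degree d and depth k, the per-piece bound |H|(dk + 1) + |x|/(n/2 - εn) can be evaluated directly. The parameter values in the example are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def piece_size_bits(doc_bits, n, d, k, eps, hash_bits=160):
    """Upper bound |H|(dk + 1) + |x| / (n/2 - eps*n) on one piece, in bits."""
    verification = hash_bits * (d * k + 1)   # k levels of d hashes, plus H(x)
    ida_share = doc_bits / (n / 2 - eps * n)
    return verification + ida_share


if __name__ == "__main__":
    # Hypothetical setting: 1 MB document, n = 1000 servers.
    for d, k in ((100, 3), (30, 5)):
        print(d, k, piece_size_bits(8_000_000, 1000, d, k, eps=0.1))
```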
During retrieval, up to t vertices of ST may be corrupted, and hence, some set
D of edges incident with t ≤ n/2 vertices are removed. Notice that we do not
know which are the t corrupt vertices, and get to see a graph after deleting only the
edges in D, which are a subset of the edges incident with those t corrupt vertices.
Nevertheless, using k-transitive verification, we know that every neighborhood
of diameter k is either all correct or all corrupt. Hence, our graph construction
needs to guarantee that after the removal of the set D of edges incident with t
vertices from ST, there remains a set of n/2 - εn good vertices that are connected
with a low diameter. We proceed to show such a construction.

3.1 Determining ST 

We consider the problem of finding a storage graph ST such that when an
arbitrary set D of edges incident with a set of t = n/2 malicious vertices is
deleted there is still a large component with small diameter in the remaining part.
We handle this case by picking a graph such that after the deletion of any set
of t vertices we are guaranteed to have a set of almost n - t vertices connected
with a small diameter, say k, where we stipulate that k = O(log n / log d). In
the following we show that well known expander graphs satisfy our
requirements. Namely, after deleting an arbitrary set of t vertices, the
remaining set of vertices contains a subgraph of size n/2 - εn and of diameter
k = O(log n / log d), where d > 80 is a constant. The main result proved in the
remainder of this section is therefore as follows:

Theorem 1. In an LPS expander [LPS86] with d > 80, if one deletes half of
the vertices then there is a vertex w such that n/2 - O(n/d) of the remaining
vertices are at distance O(log n / log d) from w.

We shall use the following result of Alon et al. [AFWZ95].

Theorem 2. Let G = (V, E) be a d-regular graph such that the absolute values
of the eigenvalues of its adjacency matrix, except the largest, are no greater than λ.
For a set B ⊆ V, |B| = μ|V|, let P be the set of walks of length k (edges) that
are all contained in B. Then,

$$|B|\, d^k \left(\mu - \frac{\lambda}{d}(1-\mu)\right)^k \le |P| \le |B|\, d^k \left(\mu + \frac{\lambda}{d}(1-\mu)\right)^k .$$

Proof (of Theorem 1). Fix a set B ⊆ V, |B| = n/2. For a vertex v ∈ B denote
by P_v the set of walks of length k that start at v and never leave B. It follows
from the lower bound in Theorem 2 (with μ = 1/2) that there is a vertex w ∈ B for which

$$|P_w| \ge d^k \left(\frac{1}{2} - \frac{\lambda}{2d}\right)^k . \qquad (1)$$

Denote by C the set of vertices occurring on walks in P_w. We claim that if

$$k \ge \frac{\log n}{\log\left(\dfrac{1 - \lambda/d}{2\left(c + \frac{\lambda}{d}(1-c)\right)}\right)}$$

then |C| ≥ cn. Otherwise, |C| < cn, and from the upper bound in Theorem 2 we
obtain that

$$|P_w| \le cn\, d^k \left(c + \frac{\lambda}{d}(1-c)\right)^k . \qquad (2)$$

Combining the lower bound in (1) and the upper bound in (2) we obtain that

$$k < \frac{\log n}{\log\left(\dfrac{1 - \lambda/d}{2\left(c + \frac{\lambda}{d}(1-c)\right)}\right)} ,$$

in contradiction with our choice of k.

In particular, for c = 3(λ/d)^2 we obtain that there is a vertex w ∈ B such that
there are at least 3(λ/d)^2 n vertices within distance

$$k = \frac{\log n}{\log\left(\dfrac{1 - \lambda/d}{6(\lambda/d)^2 + 2(\lambda/d)\left(1 - 3(\lambda/d)^2\right)}\right)} \qquad (3)$$

from w in B. If we take an LPS expander then λ = 2√(d - 1). It is easy to check
that for λ = 2√(d - 1) and d > 80, one has

$$\frac{1 - \lambda/d}{6(\lambda/d)^2 + 2(\lambda/d)\left(1 - 3(\lambda/d)^2\right)} > 1 .$$

Therefore, we obtain
that if G is an LPS expander with d > 80 then there is a vertex w ∈ B such
that 3(λ/d)^2 n of the vertices of B are at distance at most O(log n / log d) from w
in B. (Notice that the constant hidden by the big-O approaches 2 as d goes to
infinity.)

From Lemma 2.4 in Chapter 9 of [ASE92] it follows that if between two sets
B and C such that |B| = bn and |C| = cn there is no edge then

$$|C|\, b^2 d^2 \le \lambda^2\, b(1 - b)\, n,$$

so bc ≤ (λ/d)^2 (1 - b).

From this we get the following consequences:

1. There must be an edge between any set of size 3(λ/d)^2 n and any set of size
(1/2 - λ/d)n if λ/d < 1/4.

2. There is an edge between every two sets of size (λ/d)n.

3. There is an edge between any set of size (1/2 - λ/d)n and any set of size (e/d)n if

$$\frac{e}{d} > \left(\frac{\lambda}{d}\right)^2 \left(\frac{1/2 + \lambda/d}{1/2 - \lambda/d}\right) .$$

For LPS expanders with d > 80, we have that λ/d < 1/4 and furthermore the
condition in 3 holds for e ≥ 11. Therefore we obtain that the set of vertices within
distance k + 3 from w, where k is defined as in (3), is of size at least (1/2 - 11/d)n.
(Notice that e is smaller and goes to 4 as d goes to infinity.) □
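
The "easy to check" numerical claims of the proof can be verified with a short script. The sketch assumes λ = 2√(d - 1) for an LPS expander and c = 3(λ/d)^2 in the bound for k; the function name `lps_ratios` is ours.

```python
import math


def lps_ratios(d: int):
    """Return (lambda/d, argument of the logarithm in the bound for k,
    smallest admissible e in consequence 3) for an LPS expander of degree d."""
    x = 2 * math.sqrt(d - 1) / d                      # lambda / d
    log_arg = (1 - x) / (6 * x**2 + 2 * x * (1 - 3 * x**2))
    e_min = d * x**2 * (0.5 + x) / (0.5 - x)
    return x, log_arg, e_min


if __name__ == "__main__":
    for d in (81, 100, 1000, 10**6):
        x, log_arg, e_min = lps_ratios(d)
        hidden_const = math.log(d) / math.log(log_arg)  # k * log d / log n
        print(d, round(x, 4), round(log_arg, 3), round(e_min, 3), round(hidden_const, 3))
```

The last printed column is the constant hidden in k = O(log n / log d); it decreases slowly toward 2 as d grows, and the threshold for e decreases toward 4.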
4 Reconstruct

The goal of the Reconstruct transformation is to take n pieces retrieved from
a storage system, up to t of which may be arbitrarily altered from the originally
stored values for some document x, and to return the original document
x. That is, given a set of pieces {r_1, ..., r_n} containing at least n - t original
pieces {Share(x, i_1), ..., Share(x, i_{n-t})}, we want that Reconstruct(r_1, ..., r_n) =
x (with high probability). Note that we need to keep Reconstruct feasible as n
grows despite the uncertainty concerning up to t of the pieces. Hence, we cannot
exhaustively scan all combinations of n - t pieces to find a correct one.

Consider the set {r_1, ..., r_n} of retrieved pieces. Every unaltered piece r_i contains
the original values V_i^0, ..., V_i^k and H(x) which were stored in it. We say that
r_i, r_j are consistent if they are connected in ST and all levels of verification
information in them are consistent. This set induces a subgraph R of ST, with
all the edges between inconsistent vertices removed. Our construction of Share
guarantees the existence of a connected component in R of size n/2 - O(n/d)
whose diameter is at most k. Our recovery algorithm finds this set with an expected
linear number of steps. Here we let N^1(I) = N(I) be the set of vertices
adjacent to vertices in I in the subgraph R of ST. Also we denote by N^k(I) the
set of vertices within distance no greater than k to a vertex in I. We prove the
following lemma about R:

Lemma 1. Let I be a set of fully unaltered vertices. Then every vertex in N^y(I),
where y < k, has its first k - y levels of information unaltered.

Proof. By induction on y. For the basis of the induction, we examine the immediate
neighborhood of I. Since the first k levels of verification information
in each r_i ∈ I are unaltered, for each immediate neighbor r_j ∈ N(I) the hash
values H(V_j^0), ..., H(V_j^{k-1}) stored by some r_i ∈ I are unaltered and hence (by
the cryptographic assumption) V_j^0, ..., V_j^{k-1} are unaltered in r_j.

For the induction step, assume that the lemma holds for y' < y. Hence, every
vertex in N^{y-1}(I) has its k - (y - 1) levels of verification information unaltered.
But since N^y(I) = N(N^{y-1}(I)), by an argument as for the base case stated
above using I' = N^{y-1}(I) and k' = k - (y - 1), we obtain that each vertex in
N(I') has k' - 1 = k - (y - 1) - 1 levels of verification information unaltered, as
desired. □

As an immediate corollary of Lemma 1 we obtain that if K is a connected
set in R of diameter no greater than k then either all the IDA shares in K are
correct, or all vertices of K are altered.

4.1 The Algorithm 

The algorithm Reconstruct goes as follows: 

1. Let S = {r_1, ..., r_n}.

2. Let h be the value of H(x) that occurs in a majority of the pieces in S.

3. Pick a node r_i ∈ S at random.

4. If |N^k(R, r_i)| < n/2 - n/d, set S = S \ {r_i} and go back to step 3.

5. Get all the pieces from N^k(R, r_i), reconstruct a document x using IDA and
check that H(x) = h. If so, return x; else set S = S \ ({r_i} ∪ N^k(R, r_i)) and
go back to step 3.

As shown, the storage graph contains, after removing t faulty nodes, a connected
component of size Θ(n) and diameter O(log n). Hence, in an expected
constant number of steps, the retrieval algorithm above will terminate and return
the correct response.
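
The control flow of steps 3 to 5 can be sketched as follows. This is an illustration under several assumptions of ours: the consistency graph R is given as an adjacency dictionary, the digest h has already been determined, the size threshold is passed in as `min_size`, and IDA decoding is abstracted as a caller-supplied function `decode` (a hypothetical stand-in, not Rabin's algorithm).

```python
import hashlib
import random
from collections import deque


def ball(adj, start, radius):
    """Vertices within `radius` hops of `start` in the graph `adj` (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def reconstruct(adj, pieces, h, k, min_size, decode, rng=random):
    """Return the document whose SHA-1 digest is h, or None if none is found."""
    candidates = set(adj)
    while candidates:
        r = rng.choice(sorted(candidates))       # step 3: random node
        near = ball(adj, r, k)
        if len(near) < min_size:                 # step 4: neighbourhood too small
            candidates.discard(r)
            continue
        x = decode({v: pieces[v] for v in near})  # step 5: decode and verify
        if x is not None and hashlib.sha1(x).digest() == h:
            return x
        candidates -= near
    return None
```

In a toy run one can let every piece hold the whole document and let `decode` return the common value of the pieces it is given; corrupted vertices then either form neighbourhoods that are too small or fail the digest check.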

5 A Secure Storage System 

The application context of our work is a secure storage system. The system
consists of a set S of n servers denoted s_1, ..., s_n, and a distinct set of clients
accessing them. Correct servers remain alive and follow their specification. Faulty
servers however may experience arbitrary (Byzantine) faults, i.e., in the extreme,
they can act maliciously in a coordinated way and arbitrarily deviate from their
prescribed protocols (ranging from not responding, to changing their behavior
and modifying data stored on them). Throughout the run of our protocols, however,
we assume a bound t = n/2 on the number of faulty servers. Clients are
assumed to be correct, and each client may communicate with any server over
a reliable, authenticated communication channel. That is, a client c receives a
message m from a correct server s if and only if s sent m, and likewise s receives
m' from c iff c sent m'. Furthermore, we assume a known upper bound τ on
the duration of a round-trip exchange between a client and a correct server, i.e.,
a client receives a response to message m sent to a correct server s within at
most τ delay. In our protocols, we need not make any assumption nor employ
communication among the servers.

The storage system provides a pair of protocols, store and retrieve, whereby
clients can store a document x at the servers and retrieve x from them despite
arbitrary failures of up to t servers. More precisely, the store and retrieve protocols
are as follows:

store: For a client to store x, it sends a message (store, x) to each server in S,
and waits for acknowledgment from n - t servers.

retrieve: For a client to retrieve the contents of x, it contacts each server in S
with a request (retrieve). It waits for a period of τ to collect a set of responses
A = {a_s}_{s ∈ S}, where each a_s is either a response of the form (piece, x_s), if
s responded in time, or ⊥ if the timeout expired before s's response was
received. The client returns Reconstruct(A) as the retrieved content.

Each server s_i that receives a message (store, x) stores locally the value
Share(x, i). And when a server s receives a (retrieve) request it promptly responds
with the currently stored piece.
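
The message flow of the two protocols can be mimicked by a small in-memory simulation. Everything here is a stand-in of ours: there is no network or timeout, `None` plays the role of ⊥, and `share` and `reconstruct` are supplied by the caller (the demo uses plain replication and a majority vote rather than the Share and Reconstruct functions of the paper).

```python
from collections import Counter


class Server:
    """A correct server: stores its piece and returns it on request."""

    def __init__(self, index, share):
        self.index, self.share, self.piece = index, share, None

    def on_store(self, x):
        self.piece = self.share(x, self.index)
        return "ack"

    def on_retrieve(self):
        return self.piece


class FaultyServer(Server):
    """A Byzantine server that answers with altered data."""

    def on_retrieve(self):
        return b"garbage"


def store(servers, x, t):
    """Send (store, x) to every server; succeed once n - t acknowledge."""
    acks = sum(1 for s in servers if s.on_store(x) == "ack")
    return acks >= len(servers) - t


def retrieve(servers, reconstruct):
    """Collect one response per server (None stands for a timeout)."""
    responses = [s.on_retrieve() for s in servers]
    return reconstruct(responses)


if __name__ == "__main__":
    def majority(responses):
        return Counter(r for r in responses if r is not None).most_common(1)[0][0]

    servers = [Server(i, lambda x, i: x) for i in range(3)]
    servers += [FaultyServer(i, lambda x, i: x) for i in range(3, 5)]
    store(servers, b"document", t=2)
    print(retrieve(servers, majority))
```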
A few points are worth noting in this description. First, due to our failure
model, a client may receive more than n - t responses to a query, albeit some
undetectably corrupted. By assumption, though, the retrieved set A will contain
n - t original pieces, and hence, our Share and Reconstruct algorithms above
guarantee that Reconstruct(A) yields the original stored content of x. Second,
the store protocol assumes that the computation of each piece is done by each
individual server. This is done for simplicity of the exposition. Another possibility
would be for a client or some gateway between the clients and servers to first
compute the pieces Share(x, 1), ..., Share(x, n) and then send each piece to its
corresponding server. The latter form saves computation time by performing it
only once at the client (or the gateway), and comes at the expense of increasing
the load on the client during a store operation. Both forms can be supported,
and are orthogonal to the discussion here. Third,
during a retrieve operation the client may optimize access to the servers, e.g., by
contacting an initial set of n - t servers, which will suffice in the normal faultless
state of the system, and dynamically increasing it only as needed. Such methods 
are extensively discussed in the relevant literature on distributed systems and 
replicated databases, and are not the main focus of the work at hand. Finally, for 
simplicity, we have assumed that store and retrieve operations do not overlap, 
though in practice, concurrency control mechanisms must be applied to enforce 
this. 



6 Discussion 

Our research leaves open a number of issues. First, our constants, in particular, 
the degree d, are rather large, and hence the results are beneficial for very large 
systems only. We are looking for graph constructions facilitating our methods
for smaller system sizes. One such family of candidates is finite projective
geometries.

Second, our adversarial assumption is rather strong, namely, a fully adaptive
malicious adversary, and it might be possible to improve efficiency if we adopt
a weaker adversarial model. In particular, one might accept in practice a non-adaptive
adversarial model, that is, one that gives the adversary t randomly
chosen servers to corrupt. Early on in this work, we envisioned making use of a
random graph G(n, p), in which each edge (i, j) exists with probability p. It is
known that for such random graphs, connectivity occurs at p = (log n + ω(n))/n
(with diameter d = O(log n / log log n)) and that the diameter becomes 2 at p =
Θ(√((log n)/n)) (see e.g. [Bollobas85]). Due to the independent selection of edges,
any subgraph G' of G(n, p), induced by removal of t randomly selected vertices,
is itself a random graph. Hence, it is also connected with a small diameter. This
provides a viable solution for smaller system sizes than our current results, albeit
for the weaker adversarial model and while incurring an increased probability
of error (that is, the probability that the resulting subgraph after removal of
faulty vertices is not connected with small diameter). Other candidates to achieve
better results in the weaker adversarial model are random regular graphs.
Abstract. We consider two natural generalizations of the notion of
transversal to a finite hypergraph, arising in data-mining and machine
learning, the so called multiple and partial transversals. We show that the
hypergraphs of all multiple and all partial transversals are dual-bounded
in the sense that in both cases, the size of the dual hypergraph is bounded 
by a polynomial in the cardinality and the length of description of the 
input hypergraph. Our bounds are based on new inequalities of extremal 
set theory and threshold logic, which may be of independent interest. 
We also show that the problems of generating all multiple and all partial 
transversals of an arbitrary hypergraph are polynomial-time reducible 
to the well-known dualization problem of hypergraphs. As a corollary, 
we obtain incremental quasi-polynomial-time algorithms for both of the 
above problems, as well as for the generation of all the minimal Boolean 
solutions for an arbitrary monotone system of linear inequalities. Thus, 
it is unlikely that these problems are NP-hard. 



1 Introduction 

In this paper we consider some problems involving the generation of all subsets
of a finite set satisfying certain conditions. The most well-known problem of
this type, the generation of all minimal transversals, has applications in
combinatorics, graph theory, artificial intelligence, game theory,
reliability theory [8, 38], database theory and learning theory.

Given a finite set $V$ of $n = |V|$ points, and a hypergraph (set family) $\mathcal{A} \subseteq 2^V$,
a subset $B \subseteq V$ is called a transversal of the family $\mathcal{A}$ if $A \cap B \neq \emptyset$ for all sets
$A \in \mathcal{A}$; it is called a minimal transversal if no proper subset of $B$ is a transversal
of $\mathcal{A}$. The hypergraph $\mathcal{A}^d$ consisting of all minimal transversals of $\mathcal{A}$ is called
the dual (or transversal) hypergraph of $\mathcal{A}$. It is easy to see that if $A \in \mathcal{A}$ is not
minimal in $\mathcal{A}$, i.e. if $A' \in \mathcal{A}$ for some $A' \subset A$, then $(\mathcal{A} \setminus \{A\})^d = \mathcal{A}^d$. We can
assume therefore that all sets in $\mathcal{A}$ are minimal, i.e. that the hypergraph $\mathcal{A}$ is
Sperner. (The dual hypergraph $\mathcal{A}^d$ is Sperner by definition.) It is then easy to
verify that $(\mathcal{A}^d)^d = \mathcal{A}$ and $\bigcup_{A \in \mathcal{A}} A = \bigcup_{B \in \mathcal{A}^d} B$.

For a subset $X \subseteq V$ let $X^c = V \setminus X$ denote the complement of $X$, and let
$\mathcal{A}^c = \{A^c \mid A \in \mathcal{A}\}$ be the complementary hypergraph of $\mathcal{A}$. Then e.g. $\mathcal{A}^{dc}$ consists
of all maximal subsets containing no hyperedge of $\mathcal{A}$, while the hypergraph $\mathcal{A}^{cd}$
consists of all minimal subsets of $V$ which are not contained in any hyperedge
of $\mathcal{A}$.
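
For small instances these definitions can be checked by brute force. The sketch below enumerates all minimal transversals of an explicitly given hypergraph by increasing size; it is exponential in |V| and is meant purely as an illustration (the function name `minimal_transversals` is ours, and this is not the dualization algorithm discussed later).

```python
from itertools import combinations


def minimal_transversals(vertices, edges):
    """All inclusion-minimal subsets of `vertices` that meet every edge.

    Candidates are scanned by increasing size, so a transversal is minimal
    exactly when it contains none of the transversals found earlier.
    """
    found = []
    for size in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), size):
            s = frozenset(cand)
            if all(s & e for e in edges) and not any(f <= s for f in found):
                found.append(s)
    return set(found)


if __name__ == "__main__":
    V = {1, 2, 3, 4}
    A = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}
    dual = minimal_transversals(V, A)
    print(sorted(map(sorted, dual)))
    print(minimal_transversals(V, dual) == A)   # (A^d)^d = A for Sperner A
```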

1.1 Multiple Transversals 

Given a hypergraph $\mathcal{A} \subseteq 2^V$ and a non-negative weight $b_A$ associated with every
hyperedge $A \in \mathcal{A}$, a subset $X \subseteq V$ is called a multiple transversal (or $b$-transversal)
if $|X \cap A| \ge b_A$ holds for all $A \in \mathcal{A}$. The family of all minimal $b$-transversals
then can also be viewed as the family of support sets of minimal feasible binary
solutions to the system of inequalities

$$Ax \ge b, \qquad (1)$$

where the rows of $A = A_{\mathcal{A}}$ are exactly the characteristic vectors of the hyperedges
$A \in \mathcal{A}$, and the corresponding component of $b$ is equal to $b_A$. Clearly,
$b = (1, 1, \ldots, 1)$ corresponds to the case of (ordinary) transversals. Viewing the
columns of $A$ as characteristic vectors of sets, (1) is also known as a set covering
problem.

Generalizing further and giving up the binary nature of $A$ as well, we shall
consider the family $\mathcal{F} = \mathcal{F}_{A,b}$ of (support sets of) all minimal feasible binary
vectors to (1) for a given $m \times n$-matrix $A$ and a given $m$-vector $b$. We assume
that (1) is a monotone system of inequalities: if $x \in \mathbb{B}^n$ satisfies (1) then any
vector $y \in \mathbb{B}^n$ such that $y \ge x$ is also feasible, where $\mathbb{B} = \{0, 1\}$. For instance, (1)
is monotone if $A$ is non-negative. Note that for a monotone system (1) the dual
hypergraph $\mathcal{F}^d = \mathcal{F}^d_{A,b}$ is (the complementary hypergraph of) the collection
of (supports of) all maximal infeasible vectors for (1). In the sequel we shall
assume that the hypergraph $\mathcal{F}_{A,b}$ is represented by the system (1) and not given
explicitly, i.e., by a list of all its hyperedges. In particular, this means that the
generation of $\mathcal{F}_{A,b}$ and its dual $\mathcal{F}^d_{A,b}$ are both non-trivial.
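
A feasibility test for a monotone system Ax ≥ b, and a brute-force enumeration of the minimal feasible supports on top of it, might look as follows. The sketch assumes a non-negative integer matrix given as a list of rows; the names `is_feasible` and `minimal_feasible_supports` are ours, and the enumeration is exponential in the number of columns.

```python
from itertools import combinations


def is_feasible(A, b, support):
    """Does the 0/1 vector with the given support satisfy Ax >= b?"""
    return all(sum(row[j] for j in support) >= bi for row, bi in zip(A, b))


def minimal_feasible_supports(A, b):
    """Supports of all minimal feasible binary vectors (A assumed non-negative)."""
    n = len(A[0])
    found = []
    for size in range(n + 1):
        for cand in combinations(range(n), size):
            s = frozenset(cand)
            if is_feasible(A, b, s) and not any(f <= s for f in found):
                found.append(s)
    return set(found)


if __name__ == "__main__":
    # Rows are characteristic vectors of the hyperedges {0,1,2} and {1,2,3}.
    A = [[1, 1, 1, 0],
         [0, 1, 1, 1]]
    print(minimal_feasible_supports(A, [2, 1]))   # b-transversals with b = (2, 1)
    print(minimal_feasible_supports(A, [1, 1]))   # ordinary transversals
```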

1.2 Partial Transversals 

Given a hypergraph $\mathcal{A} \subseteq 2^V$ and a non-negative threshold $k < |\mathcal{A}|$, a subset
$X \subseteq V$ is said to be a partial transversal, or more precisely, a $k$-transversal, to
the family $\mathcal{A}$ if it intersects all but at most $k$ of the subsets of $\mathcal{A}$, i.e. if
$|\{A \in \mathcal{A} \mid A \cap X = \emptyset\}| \le k$. Let us denote by $\mathcal{A}^{d_k}$ the family of all minimal $k$-transversals
of $\mathcal{A}$. Clearly, 0-transversals are exactly the standard transversals, defined above,
i.e. $\mathcal{A}^{d_0} = \mathcal{A}^d$. In what follows we shall assume that the hypergraph $\mathcal{A}^{d_k}$ is
represented by a list of all the edges of $\mathcal{A}$ along with the value of $k$.

Let us define a $k$-union from $\mathcal{A}$ as the union of some $k$ distinct subsets of $\mathcal{A}$,
and let $\mathcal{A}^{u_k}$ denote the family of all minimal $k$-unions of $\mathcal{A}$. In other words, $\mathcal{A}^{u_k}$
is the family of all the minimal subsets of $V$ which contain at least $k$ distinct
hyperedges of $\mathcal{A}$. By the above definitions, $k$-union and $k$-transversal families
both are Sperner (even if the input hypergraph $\mathcal{A}$ is not). It is also easy to see
that the families of all minimal $k$-transversals and $(k+1)$-unions are in fact dual,
i.e., $\mathcal{A}^{d_k} = (\mathcal{A}^{u_{k+1}})^d$ for $k = 0, 1, \ldots, |\mathcal{A}| - 1$.
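
The duality between minimal k-transversals and minimal (k+1)-unions can be confirmed by brute force on a toy hypergraph. The helper names below are ours; the code is exponential and only illustrates the definitions.

```python
from itertools import combinations


def minimal_sets(family):
    """Keep only the inclusion-minimal members of a family of frozensets."""
    family = set(family)
    return {s for s in family if not any(t < s for t in family)}


def minimal_k_transversals(vertices, edges, k):
    """Minimal sets that miss at most k of the given edges."""
    candidates = (frozenset(c)
                  for r in range(len(vertices) + 1)
                  for c in combinations(sorted(vertices), r))
    return minimal_sets(x for x in candidates
                        if sum(1 for e in edges if not (x & e)) <= k)


def minimal_k_unions(edges, k):
    """Minimal unions of k distinct edges."""
    return minimal_sets(frozenset().union(*c) for c in combinations(edges, k))


if __name__ == "__main__":
    V = {1, 2, 3, 4}
    E = [frozenset(e) for e in ({1, 2}, {2, 3}, {3, 4})]
    for k in range(len(E)):
        lhs = minimal_k_transversals(V, E, k)
        rhs = minimal_k_transversals(V, list(minimal_k_unions(E, k + 1)), 0)
        print(k, sorted(map(sorted, lhs)), lhs == rhs)
```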

2 Dualization 

In this paper we shall study problems related to the generation of certain types
of transversals of hypergraphs. The simplest and most well-known case is when
a Sperner hypergraph $\mathcal{A}$ is given by an explicit list of its edges, and the problem
is the generation of its transversal hypergraph $\mathcal{A}^d$. This problem, known as
dualization, can be stated as follows:

Dualization($\mathcal{A}, \mathcal{B}$): Given explicitly a finite hypergraph $\mathcal{A}$ and a family of its
minimal transversals $\mathcal{B} \subseteq \mathcal{A}^d$, either prove that $\mathcal{B} = \mathcal{A}^d$, or find a new
transversal $X \in \mathcal{A}^d \setminus \mathcal{B}$.

Clearly, we can generate all hyperedges of $\mathcal{A}^d$ by initializing $\mathcal{B} = \emptyset$ and iteratively
solving the above problem $|\mathcal{A}^d| + 1$ times. Note also that in general, $|\mathcal{A}^d|$ can be
exponentially large both in $|\mathcal{A}|$ and $|V|$. For this reason, the complexity of generating
$\mathcal{A}^d$ is customarily measured in the input and output sizes. In particular,
$\mathcal{A}^d$ is said to be generated in incremental polynomial time if Dualization($\mathcal{A}, \mathcal{B}$)
can be solved in time polynomial in $|V|$, $|\mathcal{A}|$ and $|\mathcal{B}|$.
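
The incremental scheme (start from an empty family and call Dualization until it reports completeness) can be sketched as follows. The single step is implemented here by exhaustive search, so it is exponential and is not the quasi-polynomial algorithm referred to in the text; the function names are ours.

```python
from itertools import combinations


def is_transversal(s, edges):
    return all(s & e for e in edges)


def dualization_step(vertices, edges, known):
    """Return a minimal transversal not in `known`, or None if there is none."""
    for size in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), size):
            s = frozenset(cand)
            if s in known or not is_transversal(s, edges):
                continue
            # By monotonicity, s is minimal iff dropping any one vertex breaks it.
            if all(not is_transversal(s - {v}, edges) for v in s):
                return s
    return None


def generate_dual(vertices, edges):
    """Generate all minimal transversals; also count the calls to the step."""
    known, calls = set(), 0
    while True:
        calls += 1
        new = dualization_step(vertices, edges, known)
        if new is None:
            return known, calls
        known.add(new)


if __name__ == "__main__":
    V = {1, 2, 3, 4}
    A = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
    dual, calls = generate_dual(V, A)
    print(sorted(map(sorted, dual)), calls)
```

The number of calls is one more than the number of minimal transversals, since the last call only certifies that the list is complete.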

Dualization is one of those intriguing problems, the true complexity of which
is not yet known. For many special classes of hypergraphs it can be solved
efficiently. For example, if the sizes of all the hyperedges of $\mathcal{A}$ are limited by a
constant $r$, then dualization can be executed in incremental polynomial time.
In the quadratic case, i.e. when $r = 2$, there are even more efficient
dualization algorithms that run with polynomial delay, i.e. in $\mathrm{poly}(|V|, |\mathcal{A}|)$ time,
where $\mathcal{B}$ is systematically enlarged from $\mathcal{B} = \emptyset$ during the generation process of
$\mathcal{A}^d$. Efficient algorithms exist also for the dualization of 2-monotonic,
threshold, matroid, read-bounded, acyclic and some other classes of
hypergraphs. Though no incremental polynomial time
algorithm is known for the general case, an incremental quasi-polynomial time
one exists. This algorithm solves Dualization($\mathcal{A}, \mathcal{B}$) in $O(nm) + m^{o(\log m)}$
time, where $n = |V|$ and $m = |\mathcal{A}| + |\mathcal{B}|$ (see also [10] for more detail).
We should stress that by the above cited result the dualization problem is very
unlikely to be NP-hard.

In this paper we shall mainly consider transversals to hypergraphs not given
explicitly, but represented in some implicit way, e.g. generating the partial or
multiple transversals of a hypergraph. To be more precise, we shall say that a
Sperner hypergraph $\mathcal{G} \subseteq 2^V$ on $n = |V|$ vertices is represented by a superset
oracle $\mathcal{O}$ if for any vertex set $X \subseteq V$, $\mathcal{O}$ can decide whether or not $X$ contains
a hyperedge of $\mathcal{G}$. In what follows, we do not distinguish the superset oracle
and the input description $\mathcal{O}$ of $\mathcal{G}$. We assume that the length $|\mathcal{O}|$ of the input
description of $\mathcal{G}$ is at least $n$ and denote by $T_s = T_s(|\mathcal{O}|)$ the worst-case running
time of the oracle on any superset query "Does $X$ contain a hyperedge of $\mathcal{G}$?".
In particular, $\mathcal{O}$ is polynomial-time if $T_s \le \mathrm{poly}(|\mathcal{O}|)$. Clearly, $\mathcal{O}$ also specifies
(a superset oracle for) the dual hypergraph $\mathcal{G}^d$.

The main problem we consider then can be formulated as follows: 

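Gen($\mathcal{G}$): Given a hypergraph $\mathcal{G}$ represented by a polynomial time superset oracle
$\mathcal{O}$, and an explicit list of hyperedges $\mathcal{H} \subseteq \mathcal{G}$, either prove that $\mathcal{H} = \mathcal{G}$, or
find a new hyperedge in $\mathcal{G} \setminus \mathcal{H}$.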

We list below several simple examples for dual pairs of hypergraphs, represented 
by a polynomial time superset oracle: 

1 ) Multiple transversals. Let ([Q) be a monotone system of linear inequalities, 
and let G = Xa.s be the hypergraph introduced in Section II . II Then the input 
description D is {A, b). Clearly, for any input set X C I/, we can decide whether 
X contains a hyperedge of XA,b by checking the feasibility of (the characteristic 
vector of) X for (P). 

2) Partial transversals. Let G = be the hypergraph of the minimal k- 
transversals of a family A, as in Section fT"^ Then G is given by the threshold 
value k and a complete list of all hyperedges of A, i.e., D ~ (fc, A). For a subset 
X C V, determining whether X contains a hyperedge in A'^'^ is equivalent to 
checking if X is intersecting at least \A\ — k hyperedges of A. 

3) Monotone Boolean formulae. Let / be a (V, A)-formula with n variables and 
let G = Af he the supporting sets of all the minimal true vectors for /. Then 
D ^ f and the superset oracle checks if (the characteristic vector of) X C 1/ 
satisfies /. The dual hypergraph G‘^ is the set of all the (complements to the 
support sets of) maximal false vectors of /. 

4) Two terminal connectivity. Consider a digraph $\Gamma = (N, A)$ with a source $s$ 
and a sink $t$, and let $\mathcal{G}$ be the set of $s$-$t$ paths, i.e., minimal subsets of arcs 
that connect $s$ and $t$. Then $D \sim \Gamma$, and for a given arc set $X \subseteq A$, the superset 
oracle can use breadth-first search to check the reachability of $t$ from $s$ via a 
path consisting of arcs in $X$. Note that the dual hypergraph $\mathcal{G}^d$ is the set of all 
$s$-$t$ cuts, i.e., minimal subsets of arcs that disconnect $s$ and $t$. 

5) Helly's systems of polyhedra. Consider a family of $n$ convex polyhedra $P_i \subseteq \mathbb{R}^r$, 
$i \in V$, and let $\mathcal{G}$ denote the minimal subfamilies with no point in common. 
Then $\mathcal{G}^{dc}$ is the family of all maximal subfamilies with a nonempty intersection. 
(In particular, if $P_1, \ldots, P_n$ are the facets of a convex polytope $Q$, then these 
maximal subfamilies are those facets which have a vertex in common, and hence 
$\mathcal{G}^{dc}$ corresponds to the set of vertices of $Q$.) We have $D \sim (P_1, \ldots, P_n)$ and, given 
a subset of polytopes $X \subseteq V$, the superset oracle can use linear programming to 
check whether $\bigcap_{i \in X} P_i = \emptyset$. 
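
To make the notion of a superset oracle concrete, the following is a minimal Python sketch of the oracles for examples 2 and 4. The function names and the data representation (hyperedges as Python sets, arcs as pairs) are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import deque


def partial_transversal_oracle(hyperedges, k):
    """Oracle for example 2: X contains a minimal k-transversal
    iff X is disjoint from at most k hyperedges of the family."""
    def contains_hyperedge(X):
        X = set(X)
        missed = sum(1 for A in hyperedges if not (X & set(A)))
        return missed <= k
    return contains_hyperedge


def connectivity_oracle(s, t):
    """Oracle for example 4: an arc set X contains an s-t path
    iff t is reachable from s using only arcs of X (breadth-first search)."""
    def contains_hyperedge(X):
        successors = {}
        for (u, v) in X:
            successors.setdefault(u, []).append(v)
        seen, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                return True
            for v in successors.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return False
    return contains_hyperedge
```

Both oracles answer a query in time polynomial in the size of the input description, which is the only property the reductions in the text rely on.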
3 Main Results 

Let us point out first that problems GEN($\mathcal{G}$) and GEN($\mathcal{G}^d$) can be of very different 
complexity, in general, e.g. it is known that problem GEN($\mathcal{F}^d_{A,b}$) is NP-hard even 
for binary matrices $A$ (see EB). In contrast to this, we can show that the tasks 
of generating multiple and ordinary transversals are polynomially related. 

Theorem 1. Problem GEN($\mathcal{F}_{A,b}$) is polytime reducible to dualization. 

In particular, for any monotone system of linear inequalities (1), all minimal 
binary solutions of (1) can be generated in quasi-polynomial incremental time. 

Remark 1. Let us add that though generating all maximal infeasible binary 
points for (1) is NP-hard, there exists a polynomial randomized scheme for 
nearly uniform sampling from the set of all binary infeasible points for (1). Such 
a scheme can be obtained by combining the algorithm m for approximating the 
size of set-unions with the rapidly mixing random walk pni on the binary cube 
truncated by a single linear inequality. On the other hand, a similar randomized 
scheme for nearly uniform sampling from within the set of all binary (or all mini- 
mal binary) solutions to a given monotone system is unlikely to exist, since it 
would imply that any NP-complete problem can be solved in polynomial time by 
a randomized algorithm with arbitrarily small one-sided failure probability. This 
can be shown, by using the amplification technique of im, already for systems 
(1) with two non-zero coefficients per inequality, see e.g. uni for more detail. 

Similarly, we can show that generating (partial) $k$-transversals is also 
polynomially equivalent to the generation of ordinary transversals. 

Theorem 2. Problem GEN($\mathcal{A}^{d_k}$) is polytime reducible to dualization. 

Let us add that the dual problem of generating $(k+1)$-unions, GEN($\mathcal{A}^{u_{k+1}}$), is 
known to be NP-hard (see [22]). 

In the rest of the paper we present first the two main tools for our proofs: In 
Section 4 we discuss the method of joint generation for a pair of dual hypergraphs 
defined via a superset oracle, and show that for a polynomial-time superset 
oracle the above problem reduces to dualization. Next, in Section 5 we present 
some inequalities of threshold logic and extremal set theory showing that for the 
above cases the method of joint generation can efficiently be applied, yielding 
thus Theorems 1 and 2. In Section 6 we present some of the proofs. Due to 
space limitations, we cannot include all proofs here, and refer the reader to the 
technical report 0 for further details. Finally, Section 7 discusses some of the 
related set families arising in data mining and machine learning, and Section 8 
contains our concluding remarks. 

4 Joint Generation of Dual Bounded Hypergraphs 

One of the main ingredients in our proofs is the method of joint generation of 
the edges of a dual pair of hypergraphs: 
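GEN($\mathcal{G}, \mathcal{G}^d$): Given hypergraphs $\mathcal{G}$ and $\mathcal{G}^d$, represented by a superset oracle $D$, 
and two explicitly listed set families $\mathcal{A} \subseteq \mathcal{G}$ and $\mathcal{B} \subseteq \mathcal{G}^d$, either prove that 
$(\mathcal{A}, \mathcal{B}) = (\mathcal{G}, \mathcal{G}^d)$, or find a new set in $(\mathcal{G} \setminus \mathcal{A}) \cup (\mathcal{G}^d \setminus \mathcal{B})$. 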

For the special case when $\mathcal{A} = \mathcal{G}$ and $D$ is the explicit list of all the sets in 
$\mathcal{G}$, we obtain the dualization problem as stated in Section 0. More generally, as 
observed in [4,15], problem GEN($\mathcal{G}, \mathcal{G}^d$) can be reduced in polynomial time to 
dualization for any polynomial-time superset oracle $D$: 

Proposition 1 ([4,15]). Problem GEN($\mathcal{G}, \mathcal{G}^d$) can be solved in 
$n\,(poly(|\mathcal{A}|, |\mathcal{B}|) + T_s(|D|)) + T_{dual}$ time, where $T_{dual}$ denotes the time 
required to solve problem DUALIZATION($\mathcal{A}, \mathcal{B}$). 

In particular, for any (quasi-)polynomial-time oracle $D$, problem GEN($\mathcal{G}, \mathcal{G}^d$) can 
be solved in quasi-polynomial time. Thus, for each of the 5 examples mentioned 
above we can jointly generate all the hyperedges of $(\mathcal{G}, \mathcal{G}^d)$ in incremental 
quasi-polynomial time. 

Joint generation may not be efficient, however, for solving either of the 
problems GEN($\mathcal{G}$) or GEN($\mathcal{G}^d$) separately. For instance, as shown in PS|, both 
problems GEN($\mathcal{G}$) and GEN($\mathcal{G}^d$) are NP-hard for examples 3-5. In fact, in example 3 
these problems are NP-hard already for $(\vee, \wedge)$-formulae of depth 3. (If the depth 
is only 2 then the formula is either a CNF or a DNF and we get exactly 
dualization.) The simple reason is that we do not control which of the families $\mathcal{G} \setminus \mathcal{A}$ 
and $\mathcal{G}^d \setminus \mathcal{B}$ contains the new hyperedge produced by the algorithm. Suppose we 
want to generate $\mathcal{G}$, and the family $\mathcal{G}^d$ is exponentially larger than $\mathcal{G}$. Then, if 
we are unlucky, we can get hyperedges of $\mathcal{G}$ with exponential delay, while getting 
large subfamilies of $\mathcal{G}^d$ (which are not needed at all) in between. 

Such a problem will not arise, and GEN($\mathcal{G}, \mathcal{G}^d$) can be used to produce $\mathcal{G}$ 
efficiently, if the size of $\mathcal{G}^d$ is polynomially limited in the size of $\mathcal{G}$ and in the input 
size $|D|$, i.e. when there exists a polynomial $p$ such that $|\mathcal{G}^d| \le p(|V|, |D|, |\mathcal{G}|)$. We 
call such Sperner hypergraphs $\mathcal{G}$ dual-bounded. For dual-bounded hypergraphs, 
we can generate both $\mathcal{G}$ and $\mathcal{G}^d$ by calling GEN($\mathcal{G}, \mathcal{G}^d$) iteratively $|\mathcal{G}^d| + |\mathcal{G}| \le 
poly(|V|, |D|, |\mathcal{G}|)$ times, and hence all the hyperedges of $\mathcal{G}$ can be obtained in 
total quasi-polynomial time. 

This approach, however, may still be inefficient incrementally, i.e., for 
obtaining a single hyperedge of $\mathcal{G}$ as required in problem GEN($\mathcal{G}$). It is easy to see that 
the decision problem “Given a family $\mathcal{A} \subseteq \mathcal{G}$, determine whether $\mathcal{A} = \mathcal{G}$” is 
polynomially reducible to dualization for any dual-bounded hypergraph 
represented by a polynomial-time superset oracle. If $\mathcal{A}$ is much smaller than $\mathcal{G}$, getting 
a new hyperedge in $\mathcal{G} \setminus \mathcal{A}$ may still require exponentially many (in $|\mathcal{A}|$) calls to 
GEN($\mathcal{G}, \mathcal{G}^d$). 

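Let us call a Sperner hypergraph $\mathcal{G}$ uniformly dual-bounded if there exists a 
polynomial $p$ such that 

$$|\mathcal{H}^d \cap \mathcal{G}^d| \le p(|V|, |D|, |\mathcal{H}|) \quad \text{for all hypergraphs } \mathcal{H} \subseteq \mathcal{G}. \qquad (2)$$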
Proposition 2. Problem GEN($\mathcal{G}$) is polytime reducible to dualization for any 
uniformly dual-bounded hypergraph $\mathcal{G}$ defined by a polynomial-time superset 
oracle. 

Proof. Given a proper subfamily $\mathcal{A}$ of $\mathcal{G}$, we wish to find a new hyperedge in $\mathcal{G} \setminus \mathcal{A}$. 
Start with the given $\mathcal{A}$ and $\mathcal{B} = \emptyset$ and call GEN($\mathcal{G}, \mathcal{G}^d$) repeatedly until it outputs 
a required hyperedge. If this takes $t$ calls, then $t-1$ hyperedges in $\mathcal{G}^d \cap \mathcal{A}^d$ are 
produced. Since $\mathcal{G}$ is uniformly dual-bounded, we have $t - 1 \le |\mathcal{A}^d \cap \mathcal{G}^d| \le 
p(|V|, |D|, |\mathcal{A}|)$, and hence the statement follows by Proposition 1. □ 
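
The loop in this proof can be illustrated by a small Python sketch. It is a toy under loud assumptions: `gen_joint` solves GEN($\mathcal{G}, \mathcal{G}^d$) by exhaustive search over all subsets of $V$ (exponential time), standing in for the quasi-polynomial dualization routine that the text relies on; all function names are hypothetical.

```python
from itertools import combinations


def subsets(V):
    """All subsets of V, smallest first."""
    V = sorted(V)
    for r in range(len(V) + 1):
        for c in combinations(V, r):
            yield frozenset(c)


def gen_joint(V, oracle, A, B):
    """One call of GEN(G, G^d) by exhaustive search (toy stand-in).
    Returns ('G', edge), ('Gd', edge), or None if (A, B) = (G, G^d)."""
    V = frozenset(V)
    for X in subsets(V):
        if any(a <= X for a in A) or any(b <= V - X for b in B):
            continue  # X is already accounted for by a listed set
        if oracle(X):
            # Shrink X greedily to a minimal set still containing a hyperedge.
            for v in sorted(X):
                if oracle(X - {v}):
                    X = X - {v}
            return ('G', X)
        # Grow X greedily to a maximal set containing no hyperedge;
        # its complement is then a minimal transversal, i.e. an edge of G^d.
        for v in sorted(V - X):
            if not oracle(X | {v}):
                X = X | {v}
        return ('Gd', V - X)
    return None


def next_hyperedge(V, oracle, A):
    """Proof of Proposition 2: start with B empty and call GEN(G, G^d)
    until it outputs a hyperedge of G or certifies that A = G."""
    B, calls = [], 0
    while True:
        calls += 1
        result = gen_joint(V, oracle, A, B)
        if result is None:
            return None, calls
        kind, edge = result
        if kind == 'G':
            return edge, calls
        B.append(edge)  # a dual hyperedge: remember it and try again
```

The number of iterations of the `while` loop is exactly the quantity $t$ bounded in the proof; uniform dual-boundedness is what keeps it polynomial in $|\mathcal{A}|$.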

5 Bounding Partial and Multiple Transversals 

Theorems 1 and 2 will follow from Proposition 2 by showing that the Sperner 
hypergraphs $\mathcal{F}_{A,b}$ and $\mathcal{A}^{d_k}$ are both uniformly dual-bounded: 

Theorem 3. For any monotone system (1) of $m$ linear inequalities in $n$ binary 
variables, and for any $\mathcal{H} \subseteq \mathcal{F}_{A,b}$ we have 

$$|\mathcal{H}^d \cap \mathcal{F}^d_{A,b}| \le mn|\mathcal{H}|. \qquad (3)$$

In particular, $|\mathcal{F}^d_{A,b}| \le mn|\mathcal{F}_{A,b}|$ follows. To prove the above theorem we shall 
use the following lemma. 

Lemma 1. Let $h : \mathbb{B}^n \to \mathbb{B}$ be a monotone Boolean function, $w = (w_1, \ldots, w_n) \in \mathbb{R}^n$, 
and $t \in \mathbb{R}$ such that the inequality $wx = \sum_{i=1}^{n} w_i x_i \ge t$ holds for all 
binary vectors $x \in \mathbb{B}^n$ for which $h(x) = 1$. Then, if $h \not\equiv 0$, we have 

$$|\max F(h) \cap \{x \mid wx < t\}| \le \sum_{x \in \min T(h)} ex,$$

where $\max F(h) \subseteq \mathbb{B}^n$ is the set of all maximal false points of $h$, $\min T(h) \subseteq \mathbb{B}^n$ 
is the set of all minimal true points of $h$, and $e$ is the vector of all ones. 

In particular, $|\max F(h) \cap \{x \mid wx < t\}| \le n|\min T(h)|$ must hold. If the function 
$h$ is threshold ($h(x) = 1 \Leftrightarrow wx \ge t$), then $|\max F(h)| \le n|\min T(h)|$ is a 
well-known inequality (see p,4l9l'/!til'/! 7]). Lemma 1 thus extends this result to arbitrary 
monotone functions. 

For the hypergraph of partial transversals we can prove the following: 

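Theorem 4. For any hypergraph $\mathcal{A} \subseteq 2^V$ of $m = |\mathcal{A}|$ hyperedges, threshold 
$0 \le k \le m - 1$, and subfamily $\mathcal{H} \subseteq \mathcal{A}^{d_k}$ we have 

$$|\mathcal{H}^d \cap \mathcal{A}^{u_{k+1}}| \le 2|\mathcal{H}|^2 + (m - k - 2)|\mathcal{H}|. \qquad (4)$$

In particular, $|\mathcal{A}^{u_{k+1}}| \le 2|\mathcal{A}^{d_k}|^2 + (m - k - 2)|\mathcal{A}^{d_k}|$ follows. To prove the 
above theorem, we need the following combinatorial inequality. 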
Lemma 2. Let $\mathcal{A} \subseteq 2^V$ be a hypergraph on $|V| = n$ vertices with $|\mathcal{A}| \ge 2$ 
hyperedges such that 

$$|A| \ge k + 1 \text{ for all } A \in \mathcal{A}, \text{ and } |B| \le k \text{ for all } B \in \mathcal{A}^{\cap}, \qquad (T)$$

where $k$ is a given threshold and $\mathcal{A}^{\cap}$ is the family of all the maximal subsets of $V$ 
which can be obtained as the intersection of two distinct hyperedges of $\mathcal{A}$. Then 

$$|\mathcal{A}| \le (n - k)|\mathcal{A}^{\cap}|. \qquad (5)$$

Note that $\mathcal{A}^{\cap}$ is a Sperner family by its definition, and that condition (T) implies 
the same for $\mathcal{A}$. Note also that the thresholdness condition (T) is essential for 
the validity of the lemma - without (T) the size of $\mathcal{A}$ can be exponentially 
larger than that of $\mathcal{A}^{\cap}$. There are examples (see 0) for Sperner hypergraphs 
$\mathcal{A}$ for which $|\mathcal{A}^{\cap}| = n/5$ and $|\mathcal{A}| = 3^{n/5} + 2n/5$, or $|\mathcal{A}^{\cap}| = (n - 2)^2/9$ and 
$|\mathcal{A}| = 3^{(n-2)/3} + 2(n - 2)/3$. 
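
As a sanity check of Lemma 2 on small instances, the following Python sketch computes the family of maximal pairwise intersections and both sides of inequality (5). The helper names are hypothetical, and the check is a brute-force illustration, not part of the proof.

```python
from itertools import combinations


def maximal_intersections(family):
    """Inclusion-maximal sets among the pairwise intersections
    of distinct hyperedges of `family`."""
    inters = {a & b for a, b in combinations(family, 2)}
    return {x for x in inters if not any(x < y for y in inters)}


def check_lemma2(family, n, k):
    """Return (condition (T) holds, |A|, (n - k) * |A^cap|)."""
    family = [frozenset(a) for a in family]
    cap = maximal_intersections(family)
    holds_T = (all(len(a) >= k + 1 for a in family)
               and all(len(b) <= k for b in cap))
    return holds_T, len(family), (n - k) * len(cap)
```

For instance, for the family of all 3-element subsets of a 5-element set and $k = 2$, the maximal intersections are the ten 2-element subsets, so the inequality reads $10 \le 3 \cdot 10$.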

The proof of Theorem 4 in the next section makes further use of the following 
modification of Lemma 2. 

Lemma 3. Let $\mathcal{S} = \{S_1, \ldots, S_\alpha\}$ and $\mathcal{T} = \{T_1, \ldots, T_\beta\}$ be two non-empty 
multi-hypergraphs on an $n$-element vertex set $V$, such that 

$$|S_i| \ge k + 1 \text{ for all } i = 1, \ldots, \alpha, \text{ and } |T_l| \le k \text{ for all } l = 1, \ldots, \beta,$$

where $k < n$ is a given threshold. Suppose that for any two distinct indices 
$1 \le i < j \le \alpha$, the intersection $S_i \cap S_j$ is contained in some hyperedge $T_l \in \mathcal{T}$. 
Then $\mathcal{S}$ is a Sperner hypergraph and 

$$\alpha \le 2\beta^2 + \beta(n - k - 2). \qquad (6)$$



6 Proofs 

Due to space limitations, we include only the proofs of Lemmas 2, 3 and Theorem 
4. The interested reader can find all the proofs and some generalizations in the 
technical report 

Proof of Lemma 2. Let $n = |V|$, $m = |\mathcal{A}| \ge 2$ and $p = |\mathcal{A}^{\cap}|$. We wish to show 
that 

$$m \le (n - k)p.$$

We prove the lemma by induction on $k$. Clearly, if $k = 0$, then the lemma holds 
for all $n$, since in this case the thresholdness condition (T) implies that all sets 
in $\mathcal{A}$ are pairwise disjoint and consequently, $m \le n - k$. Assume hence $k \ge 1$. 

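For a vertex $v \in V$, let $\mathcal{A}_v = \{A \setminus \{v\} \mid A \in \mathcal{A}, v \in A\}$ be the sets in $\mathcal{A}$ that 
contain $v$, restricted to the base set $V \setminus \{v\}$. Next, let $\mathcal{A}_v^{\cap}$ denote the family of all 
maximal pairwise intersections of distinct subsets in $\mathcal{A}_v$. If $m_v = |\mathcal{A}_v| \ge 2$ then 
$\mathcal{A}_v$ and $\mathcal{A}_v^{\cap}$ satisfy the assumptions of the lemma with $n' = n - 1$ and $k' = k - 1$. 
Hence 

$$m_v \le (n - k)p_v, \qquad (7)$$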
where $p_v = |\mathcal{A}_v^{\cap}|$. Let $V_1 = \{v \in V \mid m_v = 1\}$ and let $V_2 = \{v \in V \mid m_v \ge 2\} = 
\bigcup_{X \in \mathcal{A}^{\cap}} X$ be the union of all the (maximal) intersections in $\mathcal{A}^{\cap}$. Note that in 
view of (T) we have $s = \max_{X \in \mathcal{A}^{\cap}} |X| \le k$. Summing up the left-hand sides 
of inequalities (7) for all $v \in V_2$ we obtain: $\sum_{v \in V_2} m_v = \sum_{A \in \mathcal{A}} |A| - |V_1| \ge 
(k + 1)m - |V_1|$. Summing up the right-hand sides of the same inequalities over 
$v \in V_2$ we get $(n - k) \sum_{v \in V_2} p_v = (n - k) \sum_{X \in \mathcal{A}^{\cap}} |X| \le (n - k)ps$. Hence 

$$m \le \frac{|V_1|}{k + 1} + \frac{p(n - k)s}{k + 1}.$$

Since $|X| = s$ for at least one set in $\mathcal{A}^{\cap}$, we have $|V_1| \le n - s$. To complete the 
inductive proof it remains to show that $n - s \le (k + 1 - s)(n - k)p$. This follows 
from the inequality $n - s \le (k + 1 - s)(n - k)$. □ 

Proof of Lemma 3. The fact that $\mathcal{S}$ is Sperner is trivial. Assume w.l.o.g. that 
$\mathcal{T}$ is also a Sperner hypergraph and that all the hyperedges of $\mathcal{T}$ have cardinality 
exactly $k$. Extend $V$ by introducing two new vertices $a_l$ and $b_l$ for every $T_l \in \mathcal{T}$, 
and let $\mathcal{A}'$ be the hypergraph obtained by adding to $\mathcal{S}$ two new sets $T_l \cup \{a_l\}$ and 
$T_l \cup \{b_l\}$ for each $l = 1, \ldots, \beta$. The hypergraph $\mathcal{A}'$ has $m' = \alpha + 2\beta$ hyperedges 
and $n' = n + 2\beta$ vertices. It is easy to see that $(\mathcal{A}')^{\cap} = \mathcal{T}$. Applying Lemma 2 
with $p' = |(\mathcal{A}')^{\cap}| = \beta$ and $k' = k$, we obtain $m' \le (n' - k')p'$, which is equivalent 
to (6). □ 

Proof of Theorem 4. Given a hypergraph $\mathcal{A} \subseteq 2^V$ on an $n$-element vertex set 
$V$, let $\phi : 2^V \to 2^{\{1, \ldots, m\}}$ be the monotone mapping which assigns to a set $X \subseteq V$ 
the subset $\phi(X) \subseteq \{1, 2, \ldots, m\}$ of indices of all those hyperedges of $\mathcal{A}$ which are 
contained in $X$, i.e. $\phi(X) = \{i \mid A_i \subseteq X, 1 \le i \le m\}$. Note that for any two sets 
$X, Y \subseteq V$ we have the identity 

$$\phi(X) \cap \phi(Y) = \phi(X \cap Y). \qquad (8)$$
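
Identity (8) is easy to confirm by brute force on a small family. The Python sketch below is only an illustration; the names `phi` and `identity_holds` are hypothetical.

```python
from itertools import combinations


def phi(family, X):
    """Indices (1-based) of the hyperedges of `family` contained in X."""
    X = set(X)
    return {i for i, A in enumerate(family, start=1) if set(A) <= X}


def identity_holds(family, V):
    """Check phi(X) & phi(Y) == phi(X & Y) for all subsets X, Y of V."""
    subsets = [set(c) for r in range(len(V) + 1)
               for c in combinations(sorted(V), r)]
    return all(phi(family, X) & phi(family, Y) == phi(family, X & Y)
               for X in subsets for Y in subsets)
```

The identity holds because a hyperedge is contained in both $X$ and $Y$ exactly when it is contained in $X \cap Y$.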



Let $\mathcal{H} = \{H_1, \ldots, H_\beta\} \subseteq 2^V$ be an arbitrary non-empty collection of 
$k$-transversals for $\mathcal{A}$. To show that $\mathcal{H}$ satisfies inequality (4), consider the 
multi-hypergraph $\mathcal{T} = \{\phi(H_1^c), \ldots, \phi(H_\beta^c)\}$. Since each $H_l \in \mathcal{H}$ is a $k$-transversal to $\mathcal{A}$, 
the complement of $H_l$ contains at most $k$ hyperedges of $\mathcal{A}$. Hence $|\phi(H_l^c)| \le k$, 
i.e., the size of each hyperedge of $\mathcal{T}$ is at most $k$. 

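Let the hypergraph $\mathcal{H}^d \cap \mathcal{A}^{u_{k+1}}$ consist of $\alpha$ hyperedges, say $\mathcal{H}^d \cap \mathcal{A}^{u_{k+1}} = 
\{X_1, \ldots, X_\alpha\} \subseteq 2^V$. Consider the multi-hypergraph $\mathcal{S} = \{\phi(X_1), \ldots, \phi(X_\alpha)\}$. 
Note that $X_1, \ldots, X_\alpha$ are $(k + 1)$-unions, and hence each hyperedge of $\mathcal{S}$ has 
size at least $k + 1$. 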

Let us now show that for any two distinct indices $1 \le i < j \le \alpha$, the 
intersection of the hyperedges $\phi(X_i)$ and $\phi(X_j)$ is contained in a hyperedge 
of $\mathcal{T}$. In view of (8) we have to show that $\phi(X_i \cap X_j)$ is contained in some 
hyperedge of $\mathcal{T}$. To this end, observe that $\mathcal{H}^d \cap \mathcal{A}^{u_{k+1}}$ is a Sperner hypergraph 
and hence $X_i \cap X_j$ is a proper subset of $X_i$. However, $X_i \in \mathcal{H}^d \cap \mathcal{A}^{u_{k+1}}$ is a 
minimal transversal to $\mathcal{H}$. For this reason, $X_i \cap X_j$ misses a hyperedge of $\mathcal{H}$, say 
$H_l$. Equivalently, $X_i \cap X_j \subseteq H_l^c$, which implies $\phi(X_i \cap X_j) \subseteq \phi(H_l^c) \in \mathcal{T}$. Now 
inequality (4) and Theorem 4 readily follow from Lemma 3. □ 
7 Related Set-Families 

The notion of frequent sets appears in the data-mining literature, see HEl, 
and can be related naturally to the families considered above. More precisely, 
following a definition of EH, given a (0,1)-matrix and a threshold $k$, a subset 
of the columns is called frequent if there are at least $k$ rows having a 1 entry 
in each of the corresponding positions. The problems of generating all maximal 
frequent sets and their duals, the so-called minimal infrequent sets (for a given 
binary matrix), were proposed, and the complexity of the corresponding decision 
problems was asked in m. Results of [22] imply that it is NP-hard to determine 
whether a family of maximal frequent sets is incomplete, while our results prove 
that generating all minimal infrequent sets polynomially reduces to dualization. 
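
The definitions above can be made concrete with a brute-force Python sketch that lists the maximal frequent and the minimal infrequent column sets of a small binary matrix. It enumerates all column subsets, so it is exponential and meant only as an illustration; the function names are hypothetical.

```python
from itertools import combinations


def support(rows, cols):
    """Number of rows having a 1 in every column of `cols`."""
    return sum(1 for row in rows if all(row[c] == 1 for c in cols))


def frequent_borders(rows, k):
    """Return (maximal frequent sets, minimal infrequent sets) for threshold k."""
    n = len(rows[0])
    all_sets = [frozenset(c) for r in range(n + 1)
                for c in combinations(range(n), r)]
    frequent = [s for s in all_sets if support(rows, s) >= k]
    infrequent = [s for s in all_sets if support(rows, s) < k]
    max_frequent = {s for s in frequent if not any(s < t for t in frequent)}
    min_infrequent = {s for s in infrequent if not any(t < s for t in infrequent)}
    return max_frequent, min_infrequent
```

For the matrix with rows 110, 111, 011 and $k = 2$, the maximal frequent sets are $\{0,1\}$ and $\{1,2\}$, and the only minimal infrequent set is $\{0,2\}$, which is the minimal transversal of the complements $\{2\}$ and $\{0\}$ of the maximal frequent sets.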

Since the family $\mathcal{A}^{d_k}$ consists of all the minimal $k$-transversals to $\mathcal{A}$, i.e. 
subsets of $V$ which are disjoint from at most $k$ hyperedges of $\mathcal{A}$, the family 
$\mathcal{A}^{cd_k}$ consists of all the minimal subsets of $V$ which are contained in at most $k$ 
hyperedges of $\mathcal{A}$. It is easy to recognize that these are the minimal infrequent sets 
in a matrix, the rows of which are the characteristic vectors of the hyperedges of 
$\mathcal{A}$. Furthermore, the family $\mathcal{A}^{d_k c}$ consists of all the maximal subsets of $V$ which 
are supersets of at most $k$ hyperedges of $\mathcal{A}$. 

Due to our results above, all these families can be generated e.g. in incre- 
mental quasi-polynomial time. 

In the special case when $\mathcal{A}$ is a quadratic set-family, i.e. if all hyperedges of $\mathcal{A}$ 
are of size 2, the family $\mathcal{A}$ can also be interpreted as the edge set of a graph $G$ 
on vertex set $V$. Then, $\mathcal{A}^{d_k c}$ is also known as the family of the so-called fairly 
independent sets of the graph $G$, i.e. all the vertex subsets which induce at most 
$k$ edges (see EH). 

As it was defined above, the family $\mathcal{A}^{u_k}$ consists of all the minimal $k$-unions 
of $\mathcal{A}$, i.e. all minimal subsets of $V$ which contain at least $k$ hyperedges of $\mathcal{A}$, 
and hence the family $\mathcal{A}^{cu_k}$ consists of all the minimal subsets which contain at 
least $k$ hyperedges of $\mathcal{A}^c$. Thus, the family $\mathcal{A}^{cu_k c}$ consists of all the maximal 
$k$-intersections, i.e. maximal subsets of $V$ which are subsets of at least $k$ hyperedges 
of $\mathcal{A}$. These sets can be recognized as the maximal frequent sets in a matrix, the 
rows of which are the characteristic vectors of the hyperedges of $\mathcal{A}$. Finally, the 
family $\mathcal{A}^{u_k c}$ consists of all the maximal subsets of $V$ which are disjoint from at 
least $k$ hyperedges of $\mathcal{A}$. 

As it follows from the mentioned results (see e.g. [21,22]), generating all 
hyperedges for each of these families is NP-hard, unless $k$ (or $|\mathcal{A}| - k$) is bounded 
by a constant. 

8 General Closing Remarks 

In this paper we considered the problems of generating all partial and all multiple 
transversals. Both problems are formally more general than dualization, but in 
fact both are polynomially equivalent to it because the corresponding pairs of 
hypergraphs are uniformly dual-bounded. 
It might be tempting to look for a common generalization of these notions, 
and of these results. However, the following attempt to combine partial and 
multiple transversals fails. For instance, generating all the minimal partial binary 
solutions to a system of inequalities $Ax \ge b$ is NP-hard, even if $A$ is binary and 
b = (2, 2, ..., 2). To show this we can use arguments analogous to those of 
Consider the well-known NP-hard problem of determining whether a given graph 
$G = (V, E)$ contains an independent vertex set of size $t$, where $t \ge 2$ is a given 
threshold. Introduce $|V| + 1$ binary variables $x_0$ and $x_v$, $v \in V$, and write $t$ 
inequalities $x_u + x_v \ge 2$ for each edge $e = (u, v) \in E$, followed by one inequality 
$x_0 + x_v \ge 2$, for each $v \in V$. It is easily seen that the characteristic vector of any 
edge $e = (u, v)$ is a minimal binary solution satisfying at least $t$ inequalities of 
the resulting system. Deciding whether there are other minimal binary solutions 
satisfying $\ge t$ inequalities of the system is equivalent to determining whether $G$ 
has an independent set of size $t$. 
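
The reduction can be checked by brute force on small graphs. In the Python sketch below, the label 0 stands for the extra variable $x_0$ (so vertex labels are assumed to be different from 0), and the function names are hypothetical; the enumeration is exponential and serves only as an illustration.

```python
from itertools import combinations


def minimal_partial_solutions(vertices, edges, t):
    """Minimal supports S (subsets of {0} plus the vertices) whose
    characteristic vector satisfies at least t inequalities of the system."""
    def satisfied(S):
        edge_hits = sum(1 for (u, v) in edges if u in S and v in S)
        star_hits = sum(1 for v in vertices if v in S) if 0 in S else 0
        return t * edge_hits + star_hits  # each edge inequality is written t times

    ground = [0] + sorted(vertices)
    good = [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r) if satisfied(frozenset(c)) >= t]
    return {S for S in good if not any(T < S for T in good)}


def independent_sets_of_size(vertices, edges, t):
    """All independent vertex sets of size exactly t."""
    return {frozenset(c) for c in combinations(sorted(vertices), t)
            if not any(u in c and v in c for (u, v) in edges)}
```

On the path 1-2-3 with $t = 2$, the minimal solutions are the two edges together with $\{0, 1, 3\}$, which corresponds to the independent set $\{1, 3\}$; on a triangle only the three edges remain.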
Abstract. Cut-free proofs in Herbelin's sequent calculus are in 1-1 
correspondence with normal natural deduction proofs. For this reason 
Herbelin's sequent calculus has been considered a privileged middle-point 
between L-systems and natural deduction. However, this bijection does 
not extend to proofs containing cuts and Herbelin observed that his 
cut-elimination procedure is not isomorphic to $\beta$-reduction. 

In this paper we equip Herbelin's system with rewrite rules which, at the 
same time: (1) complete in a sense the cut elimination procedure first 
proposed by Herbelin; and (2) perform the intuitionistic “fragment” of 
the tq-protocol - a cut-elimination procedure for classical logic defined 
by Danos, Joinet and Schellinx. Moreover we identify the subcalculus of 
our system which is isomorphic to natural deduction, the isomorphism 
being with respect not only to proofs but also to normalisation. 

Our results show, for the implicational fragment of intuitionistic logic, 
how to embed natural deduction in the much wider world of sequent 
calculus and what a particular cut-elimination procedure normalisation 
is. 



1 Introduction 



In his paper about a “$\lambda$-calculus structure” isomorphic to a “Gentzen-style 
sequent calculus structure” ^ , Herbelin proposed to define a $\lambda$-like calculus 
corresponding to an LJ-like sequent calculus in the same way as $\lambda$-calculus corresponds 
to natural deduction. 

Herbelin starts by refining the simplest, many-one assignment of terms to 
sequent calculus proofs, usually denoted by $\varphi$ and that comes from the theory of 
the relationship between sequent calculus and natural deduction 11011519111131 . 
The refinement is to consider a restriction of LJ called LJT (respec. a term 
calculus called $\bar\lambda$-calculus) whose cut-free proofs (respec. cut-free terms) are in 
1-1 correspondence with normal natural deduction proofs (respec. normal 
$\lambda$-terms). 
Dyckhoff and Pinto showed the merits of the cut-free fragment of LJT 
as a proof-theoretical tool and emphasized its privileged intermediate position 
between sequent calculus and natural deduction. The purpose of this paper is to 
define an intermediate system of this kind for proofs possibly containing cuts and 
a cut-elimination procedure which together give an isomorphic copy of natural 
deduction in sequent calculus format, with respect not only to proofs but also to 
proof normalisation. 

Full LJT is not the solution to this problem. The bijection with natural 
deduction does not extend to proofs with cuts and Herbelin observed that his 
cut-elimination procedure fails to implement full $\beta$-reduction (it just implements 
a strategy). 

The $\bar\lambda$-calculus includes an operator of explicit substitution so that local 
steps of cut permutation can be written as elementary steps of substitution 
propagation (calculi of explicit substitution for similar purposes can be found 
in PEnni). Instead of making substitution explicit, we perform the complete 
upwards permutation of a cut in a single step of reduction by a global operation. 
This is inspired by the so-called tq-protocol, a cut-elimination procedure for 
classical “coloured” proofs defined by Danos, Joinet and Schellinx p. 

We equip LJT with a reduction procedure of this kind which completes, in a 
sense, LJT's original procedure, obtaining a sequent calculus and corresponding 
$\lambda$-calculus which we call $HJ^+$ and $\lambda_H^+$, respectively. We prove that $HJ^+$ is just 
performing the intuitionistic “fragment” of the tq-protocol and that the typable 
subcalculus of $\lambda_H^+$ is strongly normalising and confluent. Furthermore, we identify 
natural subsystems $HJ$ and $\lambda_H$ such that $HJ$ (respec. $\lambda_H$) is isomorphic, in 
the strong sense required above, to NJ (respec. $\lambda$). In particular, both $\lambda_H^+$ and 
$\lambda_H$ implement full $\beta$-reduction. 

The reader finds in Table 1 (where inc stands for inclusion) a map of systems 
and translations which will appear in the following. 

Notations and Terminology 

We just treat intuitionistic implicational logic (implication written as $\supset$). 
Barendregt's convention applies to all calculi in this paper. A context is a consistent 
set of declarations $x : A$. By consistent we mean that if $x : A$ and $x : B$ are in a 
context, then $A = B$. Contexts are ranged over by $\Gamma$. We write $x \in \Gamma$ meaning 
$x : A \in \Gamma$ for some $A$. $\Gamma, x : A$ denotes the consistent union $\Gamma \cup \{x : A\}$, which 
means that, if $x$ is already declared in $\Gamma$, then it is declared with type $A$. 

We call left (respec. right) subderivation of a cut instance the subderivation 
in which the cutformula occurs in the RHS (respec. LHS) of its endsequent. Such 
cutformula is called the left (respec. right) cutformula of that instance. 

2 Background 

2.1 Herbelin's LJT and $\bar\lambda$-Calculus 

We refer to jOj for details about LJT and $\bar\lambda$. We adopt some simplification of 
syntax introduced in [2] (but the numbering of cuts is different!). Table 2 presents 
Table 1. Map of systems and translations 

[Diagram relating the systems LJT/$\bar\lambda$, LJT$^+$/$\bar\lambda^+$, $LJ^t$, $HJ^+$/$\lambda_H^+$, $HJ$/$\lambda_H$ and NJ/$\lambda$ through inclusions (inc) and translations.] 


the syntax and typing rules of $\bar\lambda$ (and thus the inference rules of LJT) and the 
reduction rules of $\bar\lambda$ which define the cut-elimination procedure of LJT (Der 
stands for Dereliction). 

In $\bar\lambda$ there are two kinds of expressions: terms and lists of terms. The term 
$x[t_1, \ldots, t_n]$ can be seen as the $\lambda$-term $(\ldots(x t_1) \ldots t_n)$ but with the advantage of 
having the head-variable at the surface. $t\{x := v\}$ and $l\{x := v\}$ are explicit 
substitution operators and $ll'$ is an explicit append of lists (cf. the reduction 
rules). Notice that in $t\{x := v\}$ and $l\{x := v\}$, $x$ is bound and $x \notin FV(v)$. 

There are two kinds of derivable sequents: $\Gamma; - \vdash t : B$ and $\Gamma; A \vdash l : B$. 
In both there is a distinguished position in the LHS called stoup. The crucial 
restriction of LJT is that the rule $L\supset$ introduces $A \supset B$ in the stoup and $B$ has 
to be in the stoup of the right subderivation's endsequent. Forget for a second 
rules $cut_2$ and $cut_4$. In this case (in particular in cut-free LJT), besides $Ax$, no 
rule can introduce a formula in the stoup and thus the last rule of the right 
subderivation of an instance of $L\supset$ is again $L\supset$ and so on until $Ax$ is reached. 

There are two kinds of cuts (head-cut and mid-cut) according to whether the 
right cutformula is in the stoup or not. Notice that in the reduction rules there 
is no permutation of cuts. 



2.2 $LJ^t$ and the Intuitionistic “tq-Protocol” 

Table 3 presents the sequent calculus $LJ^t$ and a corresponding, nameless term 
calculus in which a cut-elimination procedure is expressed. 

We leave to the reader to provide the definitions of free and bound variable 
in a term $L$. The idea is that, in $\mathsf{L}(x, L_1, (y)L_2)$, $x$ occurs free and $y$ bound. By 
Barendregt's convention, neither $y$ occurs free in $L_1$ nor $x$ occurs bound in $L_1$ or 
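Table 2. Herbelin's LJT and $\bar\lambda$-calculus 

$$u, v, t ::= xl \mid \lambda x.t \mid tl \mid t\{x := v\}$$
$$l, l' ::= [] \mid t :: l \mid ll' \mid l\{x := v\}$$

$$Ax\ \frac{}{\Gamma; A \vdash [] : A} \qquad Der\ \frac{\Gamma; A \vdash l : B}{\Gamma, x : A; - \vdash xl : B}$$

$$L\supset\ \frac{\Gamma; - \vdash t : A \quad \Gamma; B \vdash l : C}{\Gamma; A \supset B \vdash t :: l : C} \qquad R\supset\ \frac{\Gamma, x : A; - \vdash t : B}{\Gamma; - \vdash \lambda x.t : A \supset B}\ (x \notin \Gamma)$$

mid-cuts 

$$cut_1\ \frac{\Gamma; - \vdash v : A \quad \Gamma, x : A; - \vdash t : B}{\Gamma; - \vdash t\{x := v\} : B}\ (x \notin \Gamma) \qquad cut_2\ \frac{\Gamma; - \vdash v : A \quad \Gamma, x : A; C \vdash l : B}{\Gamma; C \vdash l\{x := v\} : B}\ (x \notin \Gamma)$$

head-cuts 

$$cut_3\ \frac{\Gamma; - \vdash t : A \quad \Gamma; A \vdash l : B}{\Gamma; - \vdash tl : B} \qquad cut_4\ \frac{\Gamma; C \vdash l : A \quad \Gamma; A \vdash l' : B}{\Gamma; C \vdash ll' : B}$$

$(\bar\lambda 11)$ $(\lambda x.t)(u :: l) \to t\{x := u\}l$ 
$(\bar\lambda 12)$ $t[] \to t$ 
$(\bar\lambda 21)$ $(xl)l' \to x(ll')$, $l' \ne []$ 
$(\bar\lambda 31)$ $(u :: l)l' \to u :: (ll')$ 
$(\bar\lambda 32)$ $[]l \to l$ 
$(\bar\lambda 41)$ $(xl)\{x := v\} \to v\,l\{x := v\}$ 
$(\bar\lambda 42)$ $(yl)\{x := v\} \to y\,l\{x := v\}$, $y \ne x$ 
$(\bar\lambda 43)$ $(\lambda y.u)\{x := v\} \to \lambda y.u\{x := v\}$ 
$(\bar\lambda 51)$ $(u :: l)\{x := v\} \to u\{x := v\} :: l\{x := v\}$ 
$(\bar\lambda 52)$ $[]\{x := v\} \to []$ 
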

$L_2$, although $x$ may occur free in $L_1$ or $L_2$ (meaning that an implicit contraction 
is happening). 

The cut-elimination procedure is a “fragment” of the so-called tq-protocol, 
a strongly normalising and confluent procedure for classical, “coloured” proofs 
defined in p. To be precise, it is the restriction of the tq-protocol to intuitionistic, 
t-coloured proofs in which an “orientation” of the multiplicative connective $\supset$ 
has been fixed. 
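Table 3. $LJ^t$ 

$$L ::= \mathsf{Ax}(x) \mid \mathsf{Cut}(L, (y)L) \mid \mathsf{L}(x, L, (y)L) \mid \mathsf{R}((x)L)$$

$$Ax\ \frac{}{\Gamma, x : A \vdash \mathsf{Ax}(x) : A} \qquad L\supset\ \frac{\Gamma \vdash L_1 : A \quad \Gamma, y : B \vdash L_2 : C}{\Gamma, x : A \supset B \vdash \mathsf{L}(x, L_1, (y)L_2) : C}\ (y \notin \Gamma)$$

$$R\supset\ \frac{\Gamma, x : A \vdash L : B}{\Gamma \vdash \mathsf{R}((x)L) : A \supset B}\ (x \notin \Gamma) \qquad Cut\ \frac{\Gamma \vdash L_1 : A \quad \Gamma, y : A \vdash L_2 : C}{\Gamma \vdash \mathsf{Cut}(L_1, (y)L_2) : C}\ (y \notin \Gamma)$$

Structural step $S_1$: 

$\mathsf{Cut}(L_1, (x)L_2) \to L_2[L_1/x]$, if $x$ is not freshly and logically introduced in $L_2$ 

Structural step $S_2$: 

$\mathsf{Cut}(L_1, (x)L_2) \to L_1[(x)L_2/-]$, 
if $x$ is freshly and logically introduced in $L_2$ and $L_1 \ne \mathsf{R}((z)L_0)$ for all $z$, $L_0$ 

Logical step: 

$\mathsf{Cut}(\mathsf{R}((z)L_0), (x)\mathsf{L}(x, L_1, (y)L_2)) \to \mathsf{Cut}(\mathsf{Cut}(L_1, (z)L_0), (y)L_2)$, 
if $x$ is freshly introduced in $\mathsf{L}(x, L_1, (y)L_2)$ 

where 

$\mathsf{Ax}(x)[L/x] = L$ 
$\mathsf{Ax}(y)[L/x] = \mathsf{Ax}(y)$, $y \ne x$ 
$\mathsf{L}(x, L', (z)L'')[L/x] = \mathsf{Cut}(L, (x)\mathsf{L}(x, L'[L/x], (z)L''[L/x]))$ 
$\mathsf{L}(y, L', (z)L'')[L/x] = \mathsf{L}(y, L'[L/x], (z)L''[L/x])$, $y \ne x$ 
$\mathsf{R}((y)L')[L/x] = \mathsf{R}((y)L'[L/x])$ 
$\mathsf{Cut}(L', (y)L'')[L/x] = \mathsf{Cut}(L'[L/x], (y)L''[L/x])$ 

$\mathsf{Ax}(y)[(x)L/-] = L[y/x]$ 
$\mathsf{L}(y, L', (z)L'')[(x)L/-] = \mathsf{L}(y, L', (z)L''[(x)L/-])$ 
$\mathsf{R}((y)L')[(x)L/-] = \mathsf{Cut}(\mathsf{R}((y)L'), (x)L)$ 
$\mathsf{Cut}(L', (y)L'')[(x)L/-] = \mathsf{Cut}(L', (y)L''[(x)L/-])$ 
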
Roughly, the protocol works as follows: a cut is first permuted upwards 
through its right subderivation (structural step $S_1$) and then through its left 
subderivation (structural step $S_2$) until it becomes a logical cut, to which the 
logical step applies, giving rise to new cuts of lower degree. A logical cut is a cut 
whose cutformulas are both freshly and logically introduced, i.e. introduced by a 
logical rule ($L\supset$ or $R\supset$) without implicit contraction. An equivalent description 
of step $S_1$ (respec. step $S_2$) is: to push the left (respec. right) subderivation 
upwards through the “tree of ancestors” fP of the right (respec. left) cutformula. 

The operations $L_2[L_1/x]$ and $L_1[(x)L_2/-]$ implement the structural steps 
$S_1$ and $S_2$, respectively, and are inspired by the operations of substitution and 
co-substitution defined by Urban and Bierman in [12]. 

3 $HJ^+$ and the $\lambda_H^+$-Calculus 

We refer to Table 4 for the definition of $HJ^+$ and $\lambda_H^+$. The motivation for these 
systems rests in the following observations. 

The “life cycle” of a cut in LJT has three stages. It starts as a mid-cut 
and the first stage is an upwards permutation through its right subderivation, 
performed by rules $\bar\lambda 4i$ and $\bar\lambda 5j$. The goal is to generate head-cuts (see rule 
$\bar\lambda 41$). The operation subst performs this permutation in a single step. In doing 
so, cuts of the form $l\{x := v\}$ become “internal” to this process and hence are 
not needed in the syntax. Now observe that in LJT such a permutation of a 
mid-cut can complete only if, in its course, we do not need to permute this mid-cut 
with another cut. This is why, in the definition of subst, extra clauses occur 
corresponding to the permutations 

$(\bar\lambda 44)$ $(tl)\{x := v\} \to t\{x := v\}\,l\{x := v\}$, 

$(\bar\lambda 45)$ $t\{y := u\}\{x := v\} \to t\{x := v\}\{y := u\{x := v\}\}$. 

Let us return to the head-cuts generated by the first stage. Notice that in a 
head-cut $vl$, if $l \ne []$ then its right cutformula is freshly and logically introduced. 
Such a cut is permuted upwards through its left subderivation by the rules $\bar\lambda 21$ 
and $\bar\lambda 3i$, generating along the way $\bar\lambda 11$-redexes, i.e. logical cuts in the $LJ^t$ sense. 
The last stage of the “life cycle” of these logical cuts is $\bar\lambda 11$-reduction, by which 
cuts of lower degree are generated. 

Again the operation insert performs in a single step the permutations of the 
second stage and cuts $ll'$ become “internal” and thus superfluous in the syntax. 
Extra clauses in insert's definition correspond to the permutations 

$(\bar\lambda 22)$ $(tl')l \to t(l'l)$, 

$(\bar\lambda 23)$ $(t\{x := v\})l \to (tl)\{x := v\}$. 

Define LJT$^+$ as LJT plus the four new reduction rules just presented. We 
leave to the reader the formalisation of the obvious relations between LJT, 
LJT$^+$ and $HJ^+$. 
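
The operations subst and insert, as defined by the clauses of Table 4, can be read as ordinary recursive functions on terms. The following Python sketch is an illustration under stated assumptions: the class names (`Var`, `Lam`, `HeadCut`, `MidCut`) are hypothetical, lists of terms are represented as Python tuples so that append is tuple concatenation, and Barendregt's convention is assumed to hold for the inputs (no renaming of bound variables is performed).

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:            # x l : head variable x applied to the list l
    head: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Lam:            # lambda-abstraction \x.t
    var: str
    body: "Term"


@dataclass(frozen=True)
class HeadCut:        # t l
    fun: "Term"
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class MidCut:         # t{x := v}
    body: "Term"
    var: str
    arg: "Term"


Term = Union[Var, Lam, HeadCut, MidCut]


def subst_list(v, x, l):
    """subst applied to each element of a list of terms."""
    return tuple(subst(v, x, u) for u in l)


def subst(v, x, t):
    """subst(v, x, t), following the clauses of Table 4."""
    if isinstance(t, Var):
        args = subst_list(v, x, t.args)
        if t.head != x:
            return Var(t.head, args)
        return v if not args else HeadCut(v, args)  # x[] becomes v, x l becomes a head-cut
    if isinstance(t, Lam):
        return Lam(t.var, subst(v, x, t.body))
    if isinstance(t, HeadCut):
        return HeadCut(subst(v, x, t.fun), subst_list(v, x, t.args))
    return MidCut(subst(v, x, t.body), t.var, subst(v, x, t.arg))


def insert(l, t):
    """insert(l, t); append(l', l) is the tuple concatenation l' + l."""
    if not l:
        return t
    if isinstance(t, Var):
        return Var(t.head, t.args + l)
    if isinstance(t, Lam):
        return HeadCut(t, l)          # a log-redex when l is non-empty
    if isinstance(t, HeadCut):
        return HeadCut(t.fun, t.args + l)
    return MidCut(insert(l, t.body), t.var, t.arg)
```

Note how `subst` creates a head-cut exactly when the substituted variable is applied to a non-empty list, and how `insert` stops at a $\lambda$-abstraction, leaving a logical cut behind.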

On the other hand, it should be clear that reductions $\to_{m(id)}$ and $\to_{h(ead)}$ in 
$HJ^+$ have a strong connection with the structural steps $S_1$ and $S_2$, respectively, 
Table 4. $HJ^+$ and $\lambda_H^+$-calculus 

$$u, v, t ::= xl \mid \lambda x.t \mid tl \mid t\{x := v\}$$
$$l, l' ::= [] \mid t :: l$$

$$Ax\ \frac{}{\Gamma; A \vdash [] : A} \qquad Der\ \frac{\Gamma; A \vdash l : B}{\Gamma, x : A; - \vdash xl : B}$$

$$L\supset\ \frac{\Gamma; - \vdash t : A \quad \Gamma; B \vdash l : C}{\Gamma; A \supset B \vdash t :: l : C} \qquad R\supset\ \frac{\Gamma, x : A; - \vdash t : B}{\Gamma; - \vdash \lambda x.t : A \supset B}\ (x \notin \Gamma)$$

$$mid\text{-}cut\ \frac{\Gamma; - \vdash v : A \quad \Gamma, x : A; - \vdash t : B}{\Gamma; - \vdash t\{x := v\} : B}\ (x \notin \Gamma) \qquad head\text{-}cut\ \frac{\Gamma; - \vdash t : A \quad \Gamma; A \vdash l : B}{\Gamma; - \vdash tl : B}$$

(mid) $t\{x := v\} \to subst(v, x, t)$ 
(head) $tl \to insert(l, t)$, if $t$ is not a $\lambda$-abstraction or $l = []$ 
(log) $(\lambda x.t)(u :: l) \to t\{x := u\}l$ 

where 

$subst(v, x, x[]) = v$ 
$subst(v, x, xl) = v\, subst(v, x, l)$, $l \ne []$ 
$subst(v, x, yl) = y\, subst(v, x, l)$, $y \ne x$ 
$subst(v, x, \lambda y.t) = \lambda y.subst(v, x, t)$ 
$subst(v, x, tl) = subst(v, x, t)\, subst(v, x, l)$ 
$subst(v, x, t\{y := u\}) = subst(v, x, t)\{y := subst(v, x, u)\}$ 

$subst(v, x, u :: l) = subst(v, x, u) :: subst(v, x, l)$ 
$subst(v, x, []) = []$ 

$insert([], t) = t$ 
$insert(l, xl') = x\, append(l', l)$, $l \ne []$ 
$insert(l, \lambda x.t) = (\lambda x.t)l$, $l \ne []$ 
$insert(l, tl') = t\, append(l', l)$, $l \ne []$ 
$insert(l, t\{x := v\}) = insert(l, t)\{x := v\}$, $l \ne []$ 

$append(t :: l, l') = t :: append(l, l')$ 
$append([], l') = l'$ 





Revisiting the Correspondence between Cut Elimination and Normalisation 607 



of $LJ^t$ and that, roughly (but not exactly), a mid-cut is an $S_1$-redex and a 
head-cut is an $S_2$-redex. This is formalised by defining a map $\bar\rho : HJ^+ \to LJ^t$ as in 
Table 5 (where $z \notin FV(l)$ in the second and last clauses of the definition of $\bar\rho$). 



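Table 5. Translations $\bar\rho$ and $\rho$

$$\begin{array}{rcl}
\bar\rho(x[\,]) &=& Ax(x) \\
\bar\rho(x(t :: l)) &=& L(x, \bar\rho(t), (z)\bar\rho(zl)) \\
\bar\rho(\lambda x.t) &=& R((x)\bar\rho(t)) \\
\bar\rho(t\{x := v\}) &=& Cut(\bar\rho(v), (x)\bar\rho(t)) \\
\bar\rho(tl) &=& Cut(\bar\rho(t), (z)\bar\rho(zl)) \\[6pt]
\rho(Ax(x)) &=& x[\,] \\
\rho(R((x)L)) &=& \lambda x.\rho(L) \\
\rho(L(x, L_1, (y)L_2)) &=& \rho(L_2)\{y := x[\rho(L_1)]\} \\
\rho(Cut(L_1, (y)L_2)) &=& \rho(L_2)\{y := \rho(L_1)\}
\end{array}$$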



Lemma 1. $\bar\rho(subst(v, x, t)) = \bar\rho(t)[\bar\rho(v)/x]$, if $x \notin FV(v)$.

Proposition 1. If $t \to_m t'$ in HJ$^+$ then either $\bar\rho(t) \to^{+}_{S_1} \bar\rho(t')$ or $\bar\rho(t) = \bar\rho(t')$
in LJ$^t$.

The single anomaly is a mid-step of the form

$$(xl)\{x := v\} \to vl , \qquad (1)$$

where $x \notin FV(l)$, which collapses in LJ$^t$ because $\bar\rho((xl)\{x := v\}) = \bar\rho(vl)$. There
is another (and last) anomaly, this time regarding head-cuts. The term $t[\,]$ is an
$S_1$-redex and not an $S_2$-redex. We can split $\to_h$ as

$$\begin{array}{ll}
(h_1) & t[\,] \to t \\
(h_2) & tl \to insert(l, t), \text{ if } t \text{ is not a } \lambda\text{-abstraction and } l \neq [\,]
\end{array}$$

and then:

Lemma 2. $\bar\rho(insert(l, t)) = \bar\rho(t)[(z)\bar\rho(zl)/-]$, if $z \notin FV(l)$ and $l \neq [\,]$.

Proposition 2. If $t \to_{h_i} t'$ in HJ$^+$ then $\bar\rho(t) \to^{+}_{S_i} \bar\rho(t')$ in LJ$^t$, $i = 1, 2$.

Finally:

Proposition 3. If $t \to_{log} t'$ in HJ$^+$ then $\bar\rho(t) \to_{Log} \bar\rho(t')$ in LJ$^t$.

Corollary 1. The typable terms of $\lambda_H^+$ are strongly normalising.

Proof. By strong normalisation of the tq-protocol, Propositions 1, 2 and 3 and the
fact that there can be no infinite reduction sequence starting from a $\lambda_H^+$-term
and only made of steps (1). □

In the following it will be useful, namely for proving confluence of $\lambda_H^+$, to
consider a translation $\rho : LJ^t \to HJ^+$, as defined in Table 5.
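Table 6. HJ and $\lambda_H$-calculus

Terms and lists:

$$u, v, t ::= xl \mid \lambda x.t \mid (\lambda x.t)(v :: l) \qquad\qquad l, l' ::= [\,] \mid t :: l$$

Typing rules:

$$Ax\ \frac{}{\Gamma; A \vdash [\,] : A} \qquad Der\ \frac{\Gamma; A \vdash l : B}{\Gamma, x : A; - \vdash xl : B}$$

$$L\supset\ \frac{\Gamma; - \vdash t : A \qquad \Gamma; B \vdash l : C}{\Gamma; A \supset B \vdash t :: l : C} \qquad R\supset\ \frac{\Gamma, x : A; - \vdash t : B}{\Gamma; - \vdash \lambda x.t : A \supset B}$$

$$beta\text{-}cut\ \frac{\Gamma, x : A; - \vdash t : B \qquad \Gamma; - \vdash v : A \qquad \Gamma; B \vdash l : C}{\Gamma; - \vdash (\lambda x.t)(v :: l) : C}$$

Reduction rule:

$$(\beta_h)\quad (\lambda x.t)(v :: l) \to insert(l, subst(v, x, t))$$

where

$$\begin{array}{rcl}
subst(v, x, xl) &=& insert(subst(v, x, l), v) \\
subst(v, x, yl) &=& y\ subst(v, x, l), \quad y \neq x \\
subst(v, x, \lambda y.t) &=& \lambda y.subst(v, x, t) \\
subst(v, x, (\lambda y.t)(u :: l)) &=& (\lambda y.subst(v, x, t))(subst(v, x, u) :: subst(v, x, l)) \\
subst(v, x, u :: l) &=& subst(v, x, u) :: subst(v, x, l) \\
subst(v, x, [\,]) &=& [\,] \\[4pt]
insert([\,], t) &=& t \\
insert(l, xl') &=& x\ append(l', l), \quad l \neq [\,] \\
insert(l, \lambda x.t) &=& (\lambda x.t)(u :: l'), \quad l = u :: l' \\
insert(l, (\lambda x.t)(u :: l')) &=& (\lambda x.t)(u :: append(l', l)), \quad l \neq [\,] \\[4pt]
append(t :: l, l') &=& t :: append(l, l') \\
append([\,], l') &=& l'
\end{array}$$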



4 HJ and the $\lambda_H$-Calculus

HJ (see Table 6) was obtained by simplifying HJ$^+$ in such a way that the new
calculus could be proved isomorphic to NJ by means of functions $\Psi$, $\Theta$ extending
those defined in [2] between cut-free LJT and normal NJ.

The first thing to do is to get rid of mid-cuts and $\to_m$. This requires that log
becomes

$$(\lambda x.t)(u :: l) \to subst(u, x, t)l . \qquad (2)$$
However this is not enough. One problem is that we would have $\Theta(t[\,]) = \Theta(t)$
and thus $\Theta$ would not be injective. Hence we must require $l \neq [\,]$ in every head-cut
$tl$. The second problem is that $\Psi(M)$ will be $h$-normal, for every $\lambda$-term $M$.
This requires two measures: (a) We restrict ourselves to $h$-normal terms. When
mid-cuts are dropped, $t(u :: l)$ is $h$-normal iff $t$ is a $\lambda$-abstraction. Thus head-cuts
are required to be of the restricted form $(\lambda x.t)(u :: l)$. (b) We drop $\to_h$
and have to reduce immediately, by performing insert, the $h$-redexes generated
in (2). Now $subst(u, x, t)l$ can itself be an $h$-redex and the $h$-redex $ul'$ may be
created at subformulas of $t$ of the form $xl'$. This explains the first clause in the
new definition of subst in HJ and the new version of log, which we call $\beta_h$.
Every $\lambda_H$-term is a $\lambda_H^+$-term and the next proposition says that $\beta_h$ is a packet
of several steps of reduction in HJ$^+$ and, indirectly, in the tq-protocol.
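
The way a single $\beta_h$ step packs substitution and insertion together can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the tagged-tuple encoding and the function names are ours, not the paper's, and bound variables are assumed to be distinct from all free variables, so no capture-avoiding renaming is performed.

```python
# Hypothetical encoding of HJ terms (an assumption of this sketch):
#   ("var", x, l)        stands for  x l            (l is a Python list of terms)
#   ("lam", x, t)        stands for  \x.t
#   ("cut", x, t, u, l)  stands for  (\x.t)(u :: l)
# Bound variables are assumed distinct from free ones (no renaming is done).

def append(l1, l2):
    """append(l1, l2): concatenation of two argument lists."""
    return list(l1) + list(l2)

def insert(l, t):
    """insert(l, t): attach the argument list l to the term t."""
    if not l:                      # insert([], t) = t
        return t
    tag = t[0]
    if tag == "var":               # insert(l, x l') = x append(l', l)
        _, x, l1 = t
        return ("var", x, append(l1, l))
    if tag == "lam":               # insert(u :: l', \x.t) = (\x.t)(u :: l')
        _, x, body = t
        return ("cut", x, body, l[0], list(l[1:]))
    # insert(l, (\x.t)(u :: l')) = (\x.t)(u :: append(l', l))
    _, x, body, u, l1 = t
    return ("cut", x, body, u, append(l1, l))

def subst(v, x, t):
    """subst(v, x, t): replace x by v in t, inserting arguments eagerly."""
    tag = t[0]
    if tag == "var":
        _, y, l = t
        new_l = [subst(v, x, s) for s in l]
        # first clause of subst in HJ: a head variable x triggers insert
        return insert(new_l, v) if y == x else ("var", y, new_l)
    if tag == "lam":
        _, y, body = t
        return ("lam", y, subst(v, x, body))
    _, y, body, u, l = t
    return ("cut", y, subst(v, x, body), subst(v, x, u),
            [subst(v, x, s) for s in l])

def beta_h(t):
    """One beta_h step at the root: (\\x.t)(v :: l) -> insert(l, subst(v, x, t))."""
    assert t[0] == "cut", "beta_h applies only to beta-cuts"
    _, x, body, v, l = t
    return insert(l, subst(v, x, body))
```

For instance, applying `beta_h` to the encoding of $(\lambda x.x[z[\,]])((\lambda w.w[\,]) :: [\,])$ produces a new beta-cut, because the substituted abstraction meets the argument list $[z[\,]]$ inside `subst`; a second step then yields $z[\,]$.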

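Proposition 4. If $t \to_{\beta_h} t'$ in HJ then $t \to^{+} t'$ in HJ$^+$.

Conversely, there is a translation $(\cdot)^- : HJ^+ \to HJ$ defined by:

$$\begin{array}{rcl}
(xl)^- &=& x\,l^- , \\
(\lambda x.t)^- &=& \lambda x.t^- , \\
(tl)^- &=& insert(l^-, t^-) , \\
(t\{x := v\})^- &=& subst(v^-, x, t^-) , \\
(u :: l)^- &=& u^- :: l^- , \\
([\,])^- &=& [\,] .
\end{array}$$

Define $\bar\rho$ as the restriction of $\bar\rho$ to HJ and $\rho = (\cdot)^- \circ \rho$. These $\bar\rho$, $\rho$ extend
those of [2].

Proposition 5. $\rho \circ \bar\rho = id$.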



Corollary 2. The typable subcalculus of $\lambda_H^+$ is confluent.

Proof. Since the typable subcalculus of $\lambda_H^+$ is strongly normalising, it suffices, by
Newman’s Lemma, to prove uniqueness of normal forms. Suppose $t \to^* t_1, t_2$ and
that both $t_i$ are cut-free. By the simulation results above, $\bar\rho(t) \to^* \bar\rho(t_1), \bar\rho(t_2)$
and, since $\bar\rho$ preserves cut-freeness, both $\bar\rho(t_i)$ are cut-free. As $\bar\rho$ preserves
typability, $\bar\rho(t_1)$ and $\bar\rho(t_2)$ are obtained within the tq-protocol and, by confluence
of the tq-protocol, $\bar\rho(t_1) = \bar\rho(t_2)$. Now crucially $t_1, t_2 \in HJ$ because $t_1, t_2$ are
cut-free. Then, by Proposition 5, $t_1 = \rho(\bar\rho(t_1)) = \rho(\bar\rho(t_2)) = t_2$. □

Now let us turn to the relation between HJ and NJ. It is convenient to give
the syntax of $\lambda$-calculus as

$$M, N ::= x \mid \lambda x.M \mid app(A)$$

$$A ::= xM \mid (\lambda x.M)N \mid AM .$$

Translations $\Psi$ and $\Theta$ between HJ and NJ are given in Table 7.
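Table 7. Translations $\Psi$ and $\Theta$

$$\begin{array}{rcl}
\Psi(x) &=& x[\,] \\
\Psi(\lambda x.M) &=& \lambda x.\Psi M \\
\Psi(app(A)) &=& \Psi'(A, [\,]) \\
\Psi'(xM, l) &=& x(\Psi M :: l) \\
\Psi'((\lambda x.M)N, l) &=& (\lambda x.\Psi M)(\Psi N :: l) \\
\Psi'(AM, l) &=& \Psi'(A, \Psi M :: l) \\[6pt]
\Theta(x[\,]) &=& x \\
\Theta(x(u :: l)) &=& \Theta'(x\Theta u, l) \\
\Theta(\lambda x.t) &=& \lambda x.\Theta t \\
\Theta((\lambda x.t)(u :: l)) &=& \Theta'((\lambda x.\Theta t)\Theta u, l) \\
\Theta'(A, [\,]) &=& app(A) \\
\Theta'(A, u :: l) &=& \Theta'(A\Theta u, l)
\end{array}$$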



Proposition 6. $\Theta \circ \Psi = id$ and $\Psi \circ \Theta = id$.

Proof. Extend the proof in [2]. □

Lemma 3. $\Psi(M[N/x]) = subst(\Psi N, x, \Psi M)$.

Corollary 3. $\Theta(subst(v, x, t)) = \Theta t[\Theta v/x]$.

The promised isomorphism of normalisation procedures is the following

Theorem 1.

1. If $M \to_{\beta} M'$ in NJ then $\Psi M \to_{\beta_h} \Psi M'$ in HJ.

2. If $t \to_{\beta_h} t'$ in HJ then $\Theta t \to_{\beta} \Theta t'$ in NJ.

Hence $\beta$ and $\beta_h$ are isomorphic, but $\beta$ performs normalisation in NJ whereas
$\beta_h$ performs cut elimination in HJ.

5 Further Work 

There are two main directions of further work. 

First, to extend this work to the other connectives of intuitionistic predicate 
logic. Challenging seems to be the positive fragment and the treatment in this 
framework of the anomalies caused by disjunction and reported in US 

Second, to generalise the whole enterprise to classical logic. The key players
should be Herbelin’s LKT and $\bar\lambda\mu$-calculus [7], Parigot’s natural deduction and
$\lambda\mu$-calculus 0 and an appropriate LK$^t$. We plan to report on this in P).
Abstract. Equational formulae are first-order formulae over an alphabet
$\mathcal{F}$ of function symbols whose only predicate symbol is syntactic
equality. Unification problems are an important special case of equational
formulae, where no universal quantifiers and no negation occur.
By the negation elimination problem we mean the problem of deciding
whether a given equational formula is semantically equivalent to a
unification problem. This decision problem has many interesting applications
in machine learning, logic programming, functional programming,
constrained rewriting, etc. In this work we present a new algorithm for the
negation elimination problem of equational formulae with purely
existential quantifier prefix. Moreover, we prove the coNP-completeness for
equational formulae in DNF and the $\Pi_2^p$-hardness in case of CNF.



1 Introduction 

Equational formulae are first-order formulae over an alphabet $\mathcal{F}$ of function
symbols whose only predicate symbol is syntactic equality. Unification problems are
an important special case of equational formulae, where no universal quantifiers
and no negation occur. By the negation elimination problem we mean the problem
of deciding whether a given equational formula is semantically equivalent to a
unification problem.

The so-called complement problems constitute a well-studied special case of
the negation elimination problem, i.e.: given terms $t, s_1, \ldots, s_n$ over a signature
$\mathcal{F}$, does there exist a finite set $\{r_1, \ldots, r_m\}$ of terms over $\mathcal{F}$, s.t. every $\mathcal{F}$-ground
instance of $t$ which is not an instance of any term $s_i$ is an instance of some term
$r_j$, and vice versa. Note that such terms $r_1, \ldots, r_m$ exist, iff negation elimination
is possible from the equational formula $(\exists x)(\forall y)[z = t \wedge t \neq s_1 \wedge \ldots \wedge t \neq s_n]$,
where the variables $x$ in $t$ and the variables $y$ in the terms $s_i$ are disjoint.
Complement problems have many interesting applications in machine learning, logic
programming, functional programming, etc (cf. PI). In constrained rewriting, 
constraints are used to express certain rule application strategies (cf. p|). Due 
to the failure of the critical pair lemma, one may eventually have to convert the 
constraints into equations only. Deciding whether equivalent equations exist, 
again corresponds to the negation elimination problem. 

For the negation elimination problem in general (cf. 1 1 tij . P|) and for com- 
plement problems in particular (cf. |S|), several decision procedures have been 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 612-ESI 2000. 
presented, which all have a rather high computational complexity. In El, the
coNP-hardness of negation elimination from complement problems was shown. 
This result was later extended in m to the coNP-completeness of these prob- 
lems. The hardness proof in m also showed that negation elimination is at least 
as hard to decide as (un-)satisfiability. Hence, by the non-elementary complex- 
ity of the satisfiability problem (cf. ini) we know that the negation elimination 
problem for arbitrary equational formulae is also non-elementary recursive. 

Our main objective in this work is to devise an efficient algorithm for the
negation elimination problem of purely existentially quantified equational
formulae. Note that these simple formulae are the target of the transformations given
in 0 and for equational formulae in arbitrary form. Hence, an efficient nega- 
tion elimination algorithm for this special case is also important for the general 
case of negation elimination from arbitrary equational formulae. Moreover, we 
prove the coNP-completeness in case of DNF with purely existential quantifier
prefix and the $\Pi_2^p$-hardness in case of CNF, respectively.

This paper is organized as follows: After recalling some basic notions in Sect.
2 we shall start our investigation of the negation elimination problem by
considering existentially quantified conjunctions of equations and disequations in
Sect. 3. In Sect. 4 and 5 this analysis will be extended to formulae in DNF and
CNF, respectively. Finally, in Sect. 6 we give a short conclusion. Due to space
limitations, proofs can only be sketched. The details are worked out in E|. 

2 Preliminaries 

An equational formula over an alphabet $\mathcal{F}$ of function symbols is a first-order
formula with syntactic equality “=” as the only predicate symbol. Throughout
this paper, we only consider the case where $\mathcal{F}$ is finite and contains at least one
constant symbol (i.e.: a function symbol with arity 0) and at least one proper
function symbol (i.e.: with arity greater than 0), since otherwise the negation
elimination problem is trivial. A disequation $s \neq t$ is a short-hand notation for
a negated equation $\neg(s = t)$. The trivially true formula is denoted by $\top$ and the
trivially false one by $\bot$. An interpretation is given through a ground substitution
$\sigma$ over $\mathcal{F}$, whose domain coincides with the free variables of the equational
formula. The trivial formula $\top$ evaluates to true in every interpretation. Likewise,
$\bot$ always evaluates to false. A single equation $s = t$ is validated by a ground
substitution $\sigma$, if $s\sigma$ and $t\sigma$ are syntactically identical. The connectives $\wedge$, $\vee$,
$\exists$ and $\forall$ are interpreted as usual. A ground substitution $\sigma$ which validates an
equational formula $\mathcal{P}$ is called a solution of $\mathcal{P}$. In order to distinguish between
syntactical identity and the semantic equivalence of two equational formulae, we
shall use the notation “$\equiv$” and “$\approx$”, respectively, i.e.: $\mathcal{P} \equiv \mathcal{Q}$ means that the two
formulae $\mathcal{P}$ and $\mathcal{Q}$ are syntactically identical, while $\mathcal{P} \approx \mathcal{Q}$ means that the two
formulae are semantically equivalent (i.e.: they have the same set of solutions).
Moreover, by $\mathcal{P} \leq \mathcal{Q}$ we denote that all solutions of $\mathcal{P}$ are also solutions of $\mathcal{Q}$.
We shall sometimes use term tuples as a short-hand notation for a disjunction
of disequations or a conjunction of equations, respectively, i.e.: For term tuples
$s = (s_1, \ldots, s_k)$ and $t = (t_1, \ldots, t_k)$, we shall abbreviate “$s_1 = t_1 \wedge \ldots \wedge s_k = t_k$”
and “$s_1 \neq t_1 \vee \ldots \vee s_k \neq t_k$” to “$s = t$” and “$s \neq t$”, respectively. Note that
conjunctions of equations and disequations can be easily simplified via
unification, i.e.: Let $\vartheta = \{z_1 \leftarrow r_1, \ldots, z_n \leftarrow r_n\}$ be the mgu of $s$ and $t$. Then
$s = t$ is equivalent to $z_1 = r_1 \wedge \ldots \wedge z_n = r_n$. Likewise, $s \neq t$ is equivalent to
$z_1 \neq r_1 \vee \ldots \vee z_n \neq r_n$. As a short-hand notation, we shall write $Equ(\vartheta)$ and
$Disequ(\vartheta)$, for these simplified equations and disequations, respectively.

An implicit generalization over a signature $\mathcal{F}$ is a construct of the form
$I = t/t_1 \vee \ldots \vee t_m$ with the intended meaning that $I$ represents all ground term
tuples over $\mathcal{F}$, that are instances of $t$ but not of any tuple $t_i$ (cf. 0). In contrast,
an explicit generalization is a disjunction $r_1 \vee \ldots \vee r_l$ of term tuples over $\mathcal{F}$, which
contains all ground term tuples over $\mathcal{F}$, that are an instance of at least one tuple
$r_i$. A ground term tuple $s = (s_1, \ldots, s_k)$ is an instance of the implicit
generalization $I = t/t_1 \vee \ldots \vee t_m$, iff the ground substitution $\sigma = \{z_1 \leftarrow s_1, \ldots, z_k \leftarrow s_k\}$ is
a solution of the equational formula $\mathcal{P} \equiv (\exists x)(\forall y)[z = t \wedge t \neq t_1 \wedge \ldots \wedge t \neq t_m]$,
where the variables $x$ occurring in $t$ and the variables $y$ occurring in the tuples
$t_i$ are disjoint. Moreover, the question as to whether the ground term tuples
contained in $I$ can be represented by an explicit generalization is equivalent to
the negation elimination problem of $\mathcal{P}$.

A term $t$ is called linear, iff it has no multiple variable occurrences. Moreover,
a term $t\vartheta$ is a linear instance of $t$, iff all terms in the range $rg(\vartheta)$ are linear and for
all variables $x, y$ in $Var(t)$ with $x \neq y$, $x\vartheta$ and $y\vartheta$ have no variables in common.
By the domain closure axiom, every ground term over $\mathcal{F}$ is an instance of the
disjunction $\bigvee_{f \in \mathcal{F}} f(x_1, \ldots, x_{\alpha(f)})$, where the $x_i$’s are pairwise distinct variables
and $\alpha(f)$ denotes the arity of $f$. In |S|, this fact is used to provide a representation
of the complement of a linear instance $t\vartheta$ of $t$ (i.e.: of all ground term tuples
that are contained in $t$ but not in $t\vartheta$). If every term tuple $t_i$ on the right-hand
side of an implicit generalization $I = t/t_1 \vee \ldots \vee t_m$ is a linear instance of $t$, then
this representation of the complement immediately yields an equivalent explicit
representation $E$ of $I$, i.e.: Let $P_i = \{p_{i1}, \ldots, p_{in_i}\}$ denote the complement of $t_i$
w.r.t. $t$ and suppose that the terms $p_{ij}$ are pairwise variable disjoint. Then $I$ is
equivalent to $E = \bigvee_{j_1=1}^{n_1} \ldots \bigvee_{j_m=1}^{n_m} mgi(p_{1j_1}, \ldots, p_{mj_m})$, where $mgi$ denotes the
most general instance (cf. flj. Proposition 3.4 and Corollary 3.5).

In a representation of the complement of an instance $t\vartheta$ w.r.t. $t$ is provided
also for the case where $t\vartheta$ is not necessarily linear. The idea of this
representation is to construct the instances $t\sigma$ of $t$, which are not contained in $t\vartheta$, in the
following way: Consider the tree representation of $\vartheta$, “deviate” from this
representation at some node and close all other branches of $\sigma$ as early as possible
with new, pairwise distinct variables. Depending on the label of a node, this
deviation can be done in two different ways: If a node is labelled by a function
symbol from $\mathcal{F}$ (note that constants are considered as function symbols of arity
0), then this node has to be labelled by a different function symbol from $\mathcal{F}$. If
a node is labelled by a variable which also occurs at some other position, then
the two occurrences of this variable have to be replaced by two fresh variables
$x, y$ and the constraint $x \neq y$ has to be added. However, if a node is labelled by
a variable which occurs nowhere else, then no deviation at all is possible at this
node. For our purposes here it suffices to know that the complement of a term
tuple $t\vartheta$ w.r.t. $t$ can be represented by a finite set of constrained term tuples
$P = \{[p_1 : X_1], \ldots, [p_n : X_n]\}$, where $[p_i : X_i]$ denotes the set of all ground
instances $p_i\sigma$ of $p_i$, s.t. $\sigma$ is a solution of the equational formula $X_i$. Moreover,
the number of elements in $P$ is quadratically bounded and the size of each
constrained term tuple $[p_i : X_i]$ is linearly bounded w.r.t. the size of the tuples $t$ and
$t\vartheta$. Finally, the constraints $X_i$ are either disequations or the trivially true
formula $\top$. Note that this representation of the complement can be easily extended
to implicit generalizations, namely: Let $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ be an implicit
generalization. Then the complement of $I$ (i.e. the set of ground term tuples
over $\mathcal{F}$ which are not contained in $I$) can be represented by $P \cup \{t\vartheta_1, \ldots, t\vartheta_m\}$,
where $P$ is a representation of the complement of $t$.

Let $S = \{s_1, \ldots, s_n\}$ be a set of terms over $\mathcal{F}$. Then the set of all common
ground instances of these terms can be computed via unification, namely: Let
$s_1, \ldots, s_n$ be pairwise variable disjoint and let $\mu = mgu(s_1, \ldots, s_n)$ denote the
most general unifier of these terms. Then $s_1\mu$ contains exactly those ground
terms over $\mathcal{F}$, which are contained in all terms $s_i$. Similarly, the common ground
instances of constrained term tuples $[p_1 : X_1], \ldots, [p_n : X_n]$ can be represented
by $[p_1\mu : Z\mu]$, where $\mu = mgu(p_1, \ldots, p_n)$ denotes the most general unifier and
$Z \equiv X_1 \wedge \ldots \wedge X_n$. Moreover, if the constraints $X_i$ are either disequations or of
the form $\top$, then such an intersection is non-empty, iff the mgu $\mu$ exists and $Z\mu$
contains no trivial disequation of the form $t \neq t$ (cf. Lemma 2).

3 Conjunctions of Equations and Disequations 

Recall that the satisfiability problem of (existentially quantified) conjunctions
of equations and disequations can be easily decided via unification, namely: Let
$\mathcal{P} \equiv (\exists x)(e_1 \wedge \ldots \wedge e_k \wedge d_1 \wedge \ldots \wedge d_l)$, where the $e_i$’s are equations and the $d_i$’s
are disequations. Then $\mathcal{P}$ is satisfiable, iff the equations are unifiable and the
application of the most general unifier $\mu = mgu(e_1, \ldots, e_k)$ to the disequations
does not produce a trivial disequation $d_i\mu$ of the form $t \neq t$. In this section
we show that also for the negation elimination problem of such formulae, there
is an efficient decision procedure based on unification. To this end, we provide
simplifications of the equations and of the disequations in the Lemmas 3.1 and
3.2, respectively. In Theorem 3.1 we shall prove that these simplifications are
indeed all we need for deciding the negation elimination problem.

Lemma 3.1. (simplification of the equations) Let $\mathcal{P} \equiv (\exists x)(e_1 \wedge \ldots \wedge e_k \wedge
d_1 \wedge \ldots \wedge d_l)$ be an equational formula over $\mathcal{F}$, where the $e_i$’s are equations and
the $d_i$’s are disequations. Moreover, let $z = (z_1, \ldots, z_n)$ denote the free variables
occurring in $\mathcal{P}$. Then $\mathcal{P}$ may be transformed in the following way:

case 1 : If the equations $e_1, \ldots, e_k$ are not unifiable, or if $e_1, \ldots, e_k$ are unifiable
with $\mu = mgu(e_1, \ldots, e_k)$, s.t. at least one disequation $d_i\mu$ is trivially false (i.e.:
it is of the form $t \neq t$ for some term $t$), then $\mathcal{P}$ is equivalent to $\mathcal{P}' \equiv \bot$.
case 2 : If the equations $e_1, \ldots, e_k$ are unifiable with $\mu = mgu(e_1, \ldots, e_k)$ and
none of the disequations $d_i\mu$ is trivially false, then we define $\mathcal{P}'$ as follows:
W.l.o.g. we assume that $dom(\mu) \cap z = \{z_1, \ldots, z_\alpha\}$ for some $\alpha \leq n$. Now let
$u = (u_{\alpha+1}, \ldots, u_n)$ be a vector of fresh, pairwise distinct variables and let
$\nu = \{z_{\alpha+1} \leftarrow u_{\alpha+1}, \ldots, z_n \leftarrow u_n\}$. Then $\mathcal{P}$ is equivalent to
$\mathcal{P}' \equiv (\exists x)(\exists u)(z = t \wedge \bigwedge_{i=1}^{l} d_i\mu\nu)$ with $t = (z_1\mu\nu, \ldots, z_\alpha\mu\nu, u_{\alpha+1}, \ldots, u_n)$.

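Proof. (Sketch): Case 1 corresponds to the satisfiability test mentioned above.
The transformation in case 2 consists of several sub-steps, namely: Let $x =
(x_1, \ldots, x_m)$. W.l.o.g. we assume that the mgu $\mu$ is of the form $\mu = \{z_1 \leftarrow
t_1, \ldots, z_\alpha \leftarrow t_\alpha, x_1 \leftarrow s_1, \ldots, x_\beta \leftarrow s_\beta\}$ for some $\beta \leq m$. Then, by the definition
of mgu’s, $\mathcal{P}$ is equivalent to

$$\mathcal{R} \equiv (\exists x)\Big(\bigwedge_{i=1}^{\alpha}(z_i = t_i) \wedge \bigwedge_{i=1}^{\beta}(x_i = s_i) \wedge \bigwedge_{i=1}^{l} d_i\mu\Big).$$

Moreover, for any equational formula $\mathcal{Q}$ and any variable $u$ not occurring in $\mathcal{Q}$,
the equivalence $\mathcal{Q} \approx (\exists u)[z = u \wedge \mathcal{Q}\{z \leftarrow u\}]$ holds. Hence, $\mathcal{R}$ is equivalent to

$$\mathcal{R}' \equiv (\exists x)(\exists u)\Big[\bigwedge_{i=\alpha+1}^{n}(z_i = u_i) \wedge \bigwedge_{i=1}^{\alpha}(z_i = t_i\nu) \wedge \bigwedge_{i=1}^{\beta}(x_i = s_i\nu) \wedge \bigwedge_{i=1}^{l} d_i\mu\nu\Big].$$

Finally, note that the $x_i$’s are existentially quantified variables which occur
nowhere else in $\mathcal{R}'$. Hence, it can be easily shown that no solutions are added to
$\mathcal{R}'$, if we delete the equations $x_i = s_i\nu$ (for details, see P3|). □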

Lemma 3.2. (simplification of the disequations) Let $\mathcal{P} \equiv (\exists x)(z = t \wedge
\bigwedge_{i=1}^{l} d_i)$ be an equational formula that results from the transformation according
to case 2 from Lemma 3.1 above (i.e., in particular, the free variables $z$ of $\mathcal{P}$
neither occur in $t$ nor in the disequations). Then, every disequation $d_i$ can be
further transformed as follows:

case 1 : If the equation $\neg d_i$ is not unifiable, then $d_i$ is equivalent to $\top$ and may
therefore be deleted.

case 2 : Otherwise, let $mgu(\neg d_i) = \vartheta_i = \{v_1 \leftarrow s_1, \ldots, v_\gamma \leftarrow s_\gamma\}$.

case 2.1 : If $dom(\vartheta_i) \cup Var(rg(\vartheta_i)) \subseteq Var(t)$ (i.e.: the $v_j$’s and all variables in
the terms $s_j$ also occur in $t$), then we replace $d_i$ by $Disequ(\vartheta_i) \equiv \bigvee_{j=1}^{\gamma} v_j \neq s_j$.

case 2.2 : If $dom(\vartheta_i) \cup Var(rg(\vartheta_i)) \not\subseteq Var(t)$, then $d_i$ may be deleted.

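Proof. (Sketch): The correctness of the cases 1 and 2.1 is clear by the definition
of mgu’s. W.l.o.g. we assume that the disequations $d_1, \ldots, d_\delta$ for some $\delta \leq l$
may be deleted via case 2.2. Moreover, let $y \subseteq x$ denote those variables from
$x$ which do not occur in $t$ and, for every $i \in \{1, \ldots, \delta\}$, let $v_{j_i} \neq s_{j_i}$ denote a
disequation in $Disequ(\vartheta_i)$ which contains a variable from $y$. Then the equivalence
$(\exists y)(\mathcal{Q} \wedge \bigwedge_{i=1}^{\delta} v_{j_i} \neq s_{j_i}) \approx \mathcal{Q}$ holds for any equational formula $\mathcal{Q}$ that contains
no variable from $y$. Moreover, $v_{j_i} \neq s_{j_i} \leq d_i$ holds for every $i \in \{1, \ldots, \delta\}$. We
thus have

$$\mathcal{P} \equiv (\exists x)\Big(z = t \wedge \bigwedge_{i=1}^{l} d_i\Big) \leq (\exists x)\Big(z = t \wedge \bigwedge_{i=\delta+1}^{l} d_i\Big) \approx (\exists x)\Big(z = t \wedge \bigwedge_{i=1}^{\delta} v_{j_i} \neq s_{j_i} \wedge \bigwedge_{i=\delta+1}^{l} d_i\Big) \leq \mathcal{P}.$$

Hence, $\mathcal{P}$ and $(\exists x)(z = t \wedge \bigwedge_{i=\delta+1}^{l} d_i)$ are
indeed equivalent. □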

An equational formula of the form $\mathcal{P} \equiv (\exists x)[z = t \wedge \bigwedge_{i=1}^{l} Disequ(\vartheta_i)]$
corresponds to the implicit generalization $I = t/(t\vartheta_1 \vee \ldots \vee t\vartheta_l)$ (cf. Sect. 2). In
particular, negation elimination from $\mathcal{P}$ corresponds to the conversion of $I$ into
an explicit generalization. Hence, one way to decide the negation elimination
problem for formulae resulting from the simplifications of the Lemmas 3.1 and
3.2 is to apply the algorithm from 0 (which has an exponential worst case
complexity) to the corresponding implicit generalization. In the following theorem
we claim that (due to the special form of the disequations), deciding the negation
elimination problem for such formulae is in fact much cheaper than this.

Theorem 3.1. (negation elimination from simplified conjunctions) Let
$\mathcal{P} \equiv (\exists x)[z = t \wedge \bigwedge_{i=1}^{l} Disequ(\vartheta_i)]$ be an equational formula which results
from the transformation of Lemma 3.1 followed by Lemma 3.2. Then negation
elimination from $\mathcal{P}$ is possible, iff for all $i \in \{1, \ldots, l\}$, $Var(rg(\vartheta_i)) = \emptyset$ holds.

Proof. (Sketch): The “if”-direction follows immediately from the correspondence
between the equational formula $\mathcal{P} \equiv (\exists x)[z = t \wedge \bigwedge_{i=1}^{l} Disequ(\vartheta_i)]$ and the
implicit generalization $I = t/(t\vartheta_1 \vee \ldots \vee t\vartheta_l)$. In particular, a disequation $Disequ(\vartheta_i)$
for which $Var(rg(\vartheta_i)) = \emptyset$ holds, corresponds to a linear instance $t\vartheta_i$ of $t$. For
the “only if”-direction, it is shown in m that the implicit generalization $I$
corresponding to $\mathcal{P}$ can be split into disjoint generalizations $I_1, I_2, \ldots$ via the
complement of the linear term tuples $t\vartheta_i$ on the right-hand side of $I$, s.t. there is
at least one implicit generalization $I_j = s_j/(s_j\eta_{j1} \vee \ldots \vee s_j\eta_{jl_j})$ where all term
tuples $s_j\eta_{jk}$ are non-linear instances of $s_j$. By Proposition 4.6 from 0, we may
then conclude that negation elimination is impossible for the equational formula
corresponding to $I_j$ and, therefore, also for $\mathcal{P}$. □

Note that the transformations in the Lemmas 3.1 and 3.2 above as well as
testing the condition from Theorem 3.1 can of course be done in polynomial time,
provided that terms are represented as directed, acyclic graphs (cf. [12]). In mi,
a similar result was proven for the negation elimination from quantifier-free
conjunctions of equations and disequations. To this end, the notion of “effectively
ground” disequations was introduced, i.e.: Let $d$ be a disequation and let $\mu$
denote the mgu of the equations. Then $d$ is effectively ground, iff the mgu $\lambda$ of
$\neg d\mu$ is a ground substitution. A disequation $d$, which may be deleted without
changing the meaning of the equational formula, is called redundant. In
particular, $d$ is redundant, if $\neg d\mu$ is not unifiable. Then it is shown in mi, that
negation elimination from a quantifier-free conjunction of equations and
disequations is possible, iff every non-redundant disequation is effectively ground.
In other words, by the transformations from the Lemmas 3.1 and 3.2 we have
extended the notion of “effectively ground” disequations to the case where
existentially quantified variables are allowed. In the following example, we put these
transformations together with Theorem 3.1 to work:

Example 3.1. Let $\mathcal{P} \equiv (\exists x_1, x_2, x_3)[g(z_1, x_2) = g(f(x_1), f(x_3)) \wedge z_2 \neq z_1 \wedge
f(z_2) \neq f(a) \wedge x_3 \neq z_1]$ be an equational formula over $\mathcal{F} = \{a, f, g\}$. Then
the mgu $\mu$ of the equations has the form $\mu = \{z_1 \leftarrow f(x_1), x_2 \leftarrow f(x_3)\}$ and,
therefore, $\mathcal{P}$ is equivalent to $\mathcal{P}' \equiv (\exists x_1, x_2, x_3)[(z_1, x_2) = (f(x_1), f(x_3)) \wedge z_2 \neq
f(x_1) \wedge f(z_2) \neq f(a) \wedge x_3 \neq f(x_1)]$. In order to bring the free variable $z_2$ to the
left-hand side of the equations, we define the substitution $\nu = \{z_2 \leftarrow u_2\}$
according to Lemma 3.1. Then $\mathcal{P}'$ is equivalent to $\mathcal{P}'' \equiv (\exists x_1, x_2, x_3, u_2)[(z_1, z_2, x_2) =
(f(x_1), u_2, f(x_3)) \wedge u_2 \neq f(x_1) \wedge f(u_2) \neq f(a) \wedge x_3 \neq f(x_1)]$. Finally, by Lemma
3.1, we may delete the equation $x_2 = f(x_3)$. We thus get $\mathcal{P}''' \equiv (\exists x_1, x_2, x_3, u_2)
[(z_1, z_2) = (f(x_1), u_2) \wedge u_2 \neq f(x_1) \wedge f(u_2) \neq f(a) \wedge x_3 \neq f(x_1)]$.

By Lemma 3.2 we may transform the disequations in the following way:
$f(u_2) \neq f(a)$ may be simplified to $u_2 \neq a$. $x_3 \neq f(x_1)$ may be deleted (due to
the presence of the variable $x_3$, which does not occur any more in the equations).
Finally, $u_2 \neq f(x_1)$ is left unchanged. Hence, the original formula $\mathcal{P}$ is equivalent
to $\mathcal{P}^{(4)} \equiv (\exists x_1, x_2, u_2)[(z_1, z_2) = (f(x_1), u_2) \wedge u_2 \neq f(x_1) \wedge u_2 \neq a]$. By Theorem
3.1, negation elimination from $\mathcal{P}^{(4)}$ (and, therefore, also from $\mathcal{P}$) is impossible,
since the disequation $u_2 \neq f(x_1)$ in $\mathcal{P}^{(4)}$ is based on a non-ground substitution.

4 Equational Formulae in DNF 

The algorithms in P| and P for solving equational formulae result in the
transformation of an arbitrary equational formula into a so-called “definition with
constraints”, which is basically an existentially quantified equational formula
in DNF. As far as the satisfiability of equational formulae is concerned, such
a transformation is indeed all we need. Note that the equational formula $\mathcal{P} \equiv
(\exists x)(\mathcal{D}_1 \vee \ldots \vee \mathcal{D}_n)$, where the $\mathcal{D}_i$’s are conjunctions of equations and
disequations, is satisfiable, iff at least one of the subformulae $(\exists x)\mathcal{D}_i$ is satisfiable. As
has already been pointed out in Sect. 3, the latter condition can be tested very
efficiently via unification. In this section we extend the simplifications for
conjunctions of equations and disequations from the previous section to equational
formulae in DNF. In Lemma 4.1 below, we provide a transformation which may
be applied to disequations of the form $Disequ(\vartheta_{ij})$ with $Var(rg(\vartheta_{ij})) \neq \emptyset$. In
Theorem 4.1 we shall then show that either this transformation allows us to
eliminate all disequations that are based on non-ground substitutions, or
negation elimination is impossible. Unfortunately, this algorithm has exponential
complexity. However, by the coNP-completeness shown in Theorem 4.2 we can
hardly expect to find a significantly better algorithm.

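Lemma 4.1. (simplification of a DNF) Let $\mathcal{P}$ be an equational formula with
$\mathcal{P} \equiv (\exists x)\big([z = t_1 \wedge \bigwedge_{j=1}^{l_1} Disequ(\vartheta_{1j})] \vee \ldots \vee [z = t_n \wedge \bigwedge_{j=1}^{l_n} Disequ(\vartheta_{nj})]\big)$, s.t.
each disjunct of $\mathcal{P}$ has already been simplified via Lemma 3.1 followed by Lemma
3.2. Let $I_i = t_i/(t_i\vartheta_{i1} \vee \ldots \vee t_i\vartheta_{il_i})$ be the implicit generalization corresponding
to the $i$-th disjunct of $\mathcal{P}$ and let $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{im_i} : X_{im_i}]\}$ denote
the complement of $I_i$. Moreover, suppose that all term tuples $p_{j\alpha}$ and $t_i\vartheta_{ij}$ are
pairwise variable disjoint. Finally, for every $\vartheta_{ij}$ with $Var(rg(\vartheta_{ij})) \neq \emptyset$, we define
$\Lambda(i,j)$ as the following set of substitutions:

$$\begin{array}{rl}
\Lambda(i,j) = \{\, \lambda \mid & (\exists \alpha_1, \ldots, \alpha_{i-1}, \alpha_{i+1}, \ldots, \alpha_n), \text{ s.t.} \\
& \mu = mgu(t_i\vartheta_{ij}, p_{1\alpha_1}, \ldots, p_{(i-1)\alpha_{i-1}}, p_{(i+1)\alpha_{i+1}}, \ldots, p_{n\alpha_n}) \text{ exists} \\
& \text{and } (X_{1\alpha_1} \wedge \ldots \wedge X_{(i-1)\alpha_{i-1}} \wedge X_{(i+1)\alpha_{i+1}} \wedge \ldots \wedge X_{n\alpha_n})\mu \text{ contains} \\
& \text{no trivial disequation and } \lambda = \mu|_{Var(rg(\vartheta_{ij}))} \,\}
\end{array}$$

If $\Lambda(i,j)$ contains only ground substitutions, then the disequation $Disequ(\vartheta_{ij})$
may be replaced by the conjunction $\bigwedge_{\lambda \in \Lambda(i,j)} Disequ(\vartheta_{ij}\lambda)$ of disequations.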

Proof. (Sketch): We make again use of the correspondence between implicit
generalizations and equational formulae. In particular, replacing the disequation
$Disequ(\vartheta_{ij})$ by the conjunction $\bigwedge_{\lambda \in \Lambda(i,j)} Disequ(\vartheta_{ij}\lambda)$ corresponds to the
replacement of the term tuple $t_i\vartheta_{ij}$ on the right-hand side of $I_i$ by the disjunction
of term tuples $\bigvee_{\lambda \in \Lambda(i,j)} t_i\vartheta_{ij}\lambda$. Now let $I = I_1 \vee \ldots \vee I_n$ denote the disjunction
of implicit generalizations corresponding to $\mathcal{P}$ and let $I' = I_1' \vee \ldots \vee I_n'$ denote the
disjunction of implicit generalizations corresponding to the formula $\mathcal{P}'$ which
results from a simultaneous application of the above replacement rule to all
disequations $Disequ(\vartheta_{ij})$. We have to show that then $I$ and $I'$ are equivalent.
Actually, $I \subseteq I'$ clearly holds, since every tuple $t_i\vartheta_{ij}\lambda$ is an instance of $t_i\vartheta_{ij}$
and, therefore, $I_i \subseteq I_i'$ trivially holds for every $i \in \{1, \ldots, n\}$. In order to see
that $I' \subseteq I$ holds as well, note that every implicit generalization $I_i'$ is basically
obtained from $I_i$ by restricting the terms $t_i\vartheta_{ij}$ to those instances, which are not
contained in the remaining implicit generalizations $I_j$. The correctness of this
restriction is due to the relation $[A - B] \cup C = [A - (B - C)] \cup C$, which holds
for arbitrary sets $A$, $B$ and $C$. □
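
The set relation used in the last step can be sanity-checked by brute force. The sketch below is not a proof; it merely enumerates all subsets of a small universe (three elements is an arbitrary choice made for this illustration) and confirms the identity on every triple.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as Python sets."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def identity_holds(a, b, c):
    """Check [A - B] u C == [A - (B - C)] u C for concrete sets."""
    return ((a - b) | c) == ((a - (b - c)) | c)

# Exhaustive check over all triples of subsets of a 3-element universe.
candidates = subsets(range(3))
all_ok = all(identity_holds(a, b, c)
             for a in candidates for b in candidates for c in candidates)
```

The check succeeds because any element removed from $A$ by $B$ but not by $B - C$ must lie in $C$, and is therefore restored by the union with $C$ on both sides.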

The transformation from Lemma id.ll is illustrated in the following example: 
Example j.l. Let T = {a,g} and let V = { 3 xi,X 2 ,X 3 ){X>i V T>2 V P3) with 

T>1 = {Zi,Z2, Zz) = {xi,X2, g{xi)) hXi^X2 
T>2 = (21, ^2, Zz) = {g{xi),g{x2),xz) A X2 yf a 
TXz = {zi,Z2,zz) = {g{xi),X2,xz) Axz yf g'^{xi), 

In order to transform the disjuncts 2 ?i, X>2 and X>z via Lemma^Hl we consider 
the corresponding implicit generalizations I\, I2 and Iz'. 

h = {xi,X 2 ,g{xi))/{x 2 ,X 2 ,g{x 2 )) 
h = {g{xi),g{x2),xz)/{g{xi),g{a),xz) 
h = { 9 {xi),X 2 ,xz)/{g{xi),X 2 ,g‘^{xi)) 

These implicit generalizations have the following complement representations 
P\,P2 and P3, respectively (note that we may omit the constraint “T” from 
tuples of the form [t : T]. Moreover, we rename the variables apart: 

Pi = {{yii,yi2,a), [(2/11,2/12,5(2/13) : 2/11 7^ 2/13], (2/11,2/11,5(2/11))} 

P2 = {(a, 521, 522), (521, a, 522), (5(521), 5(0), 522)} 

Pn = {(0,531,532), ( 5 ( 53 i), 532 , 5 ^( 53 i))| 

The disequation $x_2 \neq a$ in $\mathcal{D}_2$ cannot be further simplified, since $\vartheta_{21} = \{x_2 \leftarrow a\}$
already is a ground substitution. So we only have to transform the disequations
$x_1 \neq x_2$ in $\mathcal{D}_1$ and $x_3 \neq g^2(x_1)$ in $\mathcal{D}_3$, respectively. The set $\Lambda(1,1)$ of substitutions
which have to be applied to $x_1 \neq x_2$ in $\mathcal{D}_1$ can be computed as follows. By
$\mu_{\alpha_2,\alpha_3}$ we denote the mgu of $(x_2, x_2, g(x_2))$ with the $\alpha_2$-th term tuple from $P_2$
and the $\alpha_3$-th term tuple from $P_3$. Likewise, by $\lambda_{\alpha_2,\alpha_3}$ we denote the restriction
of $\mu_{\alpha_2,\alpha_3}$ to the variable $x_2$, which is the only variable occurring in the range of
$\vartheta_{11} = \{x_1 \leftarrow x_2\}$.

$\mu_{1,1} = mgu((x_2, x_2, g(x_2)), (a, y_{21}, y_{22}), (a, y_{31}, y_{32})) = \{x_2 \leftarrow a, y_{21} \leftarrow a, y_{22} \leftarrow g(a), y_{31} \leftarrow a, y_{32} \leftarrow g(a)\} \Rightarrow \lambda_{1,1} = \{x_2 \leftarrow a\}$

$\mu_{2,1} = mgu((x_2, x_2, g(x_2)), (y_{21}, a, y_{22}), (a, y_{31}, y_{32})) \Rightarrow \lambda_{2,1} = \{x_2 \leftarrow a\}$

$\mu_{3,2} = mgu((x_2, x_2, g(x_2)), (g(y_{21}), g(a), y_{22}), (g(y_{31}), y_{32}, g^2(y_{31}))) \Rightarrow \lambda_{3,2} = \{x_2 \leftarrow g(a)\}$
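
The mgu computations of this example can be reproduced with a few lines of code. The sketch below is a minimal first-order unifier written for illustration only; the term encoding (variables as strings, compound terms as tuples headed by their function symbol) and all helper names are our own assumptions and are not taken from the paper.

```python
def walk(t, s):
    """Follow variable bindings in substitution s until t is unbound or compound."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear in term t under s?"""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, arg, s) for arg in t[1:])

def unify(x, y, s):
    """Extend substitution s so that x and y become equal; None if impossible."""
    if s is None:
        return None
    x, y = walk(x, s), walk(y, s)
    if x == y:
        return s
    if isinstance(x, str):
        return None if occurs(x, y, s) else {**s, x: y}
    if isinstance(y, str):
        return unify(y, x, s)
    if x[0] != y[0] or len(x) != len(y):
        return None
    for xa, ya in zip(x[1:], y[1:]):
        s = unify(xa, ya, s)
        if s is None:
            return None
    return s

def resolve(t, s):
    """Apply s exhaustively to t."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(arg, s) for arg in t[1:])

# Term constructors for the signature {a, g}; tup() encodes a term tuple.
a = ("a",)
def g(t):
    return ("g", t)
def tup(*ts):
    return ("tuple",) + ts

def restricted_mgu(tuples, var):
    """Simultaneous mgu of several term tuples, restricted to one variable."""
    s = {}
    for other in tuples[1:]:
        s = unify(tuples[0], other, s)
    return None if s is None else resolve(var, s)

left = tup("x2", "x2", g("x2"))
lam_11 = restricted_mgu([left, tup(a, "y21", "y22"), tup(a, "y31", "y32")], "x2")
lam_32 = restricted_mgu([left, tup(g("y21"), g(a), "y22"),
                         tup(g("y31"), "y32", g(g("y31")))], "x2")
print(lam_11, lam_32)
```

Here `lam_11` and `lam_32` are the images of $x_2$ under the restrictions of $\mu_{1,1}$ and $\mu_{3,2}$ to $x_2$. Constrained tuples such as $[(y_{11}, y_{12}, g(y_{13})) : y_{11} \neq y_{13}]$ are not handled by this sketch; their constraint would have to be checked separately after unification.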

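Hence, by Lemma 4.1 the disequation $Disequ(\vartheta_{11}) \equiv x_1 \neq x_2$ in $\mathcal{D}_1$ may be
replaced by the conjunction $Disequ(\vartheta_{11} \circ \{x_2 \leftarrow a\}) \wedge Disequ(\vartheta_{11} \circ \{x_2 \leftarrow g(a)\})$,
i.e.: $\mathcal{D}_1$ may be transformed into $\mathcal{D}'_1 \equiv (z_1, z_2, z_3) = (x_1, x_2, g(x_1)) \wedge (x_1, x_2) \neq
(a, a) \wedge (x_1, x_2) \neq (g(a), g(a))$.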

Now we compute the set $\Lambda(3,1)$ of substitutions which have to be applied to
the disequation $x_3 \neq g^2(x_1)$ in $\mathcal{D}_3$. Again we use the notation $\mu_{\alpha_1,\alpha_2}$ to denote
the mgu of $(g(x_1), x_2, g^2(x_1))$ with the $\alpha_1$-th term tuple from $P_1$ and the $\alpha_2$-th
term tuple from $P_2$. Likewise, by $\lambda_{\alpha_1,\alpha_2}$ we denote the restriction of $\mu_{\alpha_1,\alpha_2}$ to
the variable $x_1$.

$\mu_{2,2} = mgu((g(x_1), x_2, g^2(x_1)), (y_{11}, y_{12}, g(y_{13})), (y_{21}, a, y_{22})) = \{x_2 \leftarrow a, y_{11} \leftarrow g(x_1), y_{12} \leftarrow a, y_{13} \leftarrow g(x_1), \ldots\} \Rightarrow \lambda_{2,2} = \{\}$

$\mu_{2,3} = mgu((g(x_1), x_2, g^2(x_1)), (y_{11}, y_{12}, g(y_{13})), (g(y_{21}), g(a), y_{22})) \Rightarrow \lambda_{2,3} = \{\}$

$\mu_{3,3} = mgu((g(x_1), x_2, g^2(x_1)), (y_{11}, y_{11}, g(y_{11})), (g(y_{21}), g(a), y_{22})) \Rightarrow \lambda_{3,3} = \{x_1 \leftarrow a\}$

Note that the substitution $\lambda_{2,2} = \lambda_{2,3} = \{\}$ does not have to be added to
$\Lambda(3,1)$, since the application of $\mu_{2,2}$ and $\mu_{2,3}$, respectively, to the constraint
$y_{11} \neq y_{13}$ produces a trivially false disequation, i.e.: $(y_{11} \neq y_{13})\mu_{2,2} = (y_{11} \neq
y_{13})\mu_{2,3} \equiv g(x_1) \neq g(x_1)$. But then $\Lambda(3,1)$ contains only the ground substitution
$\lambda_{3,3} = \{x_1 \leftarrow a\}$ and, therefore, the disequation $Disequ(\vartheta_{31}) \equiv x_3 \neq g^2(x_1)$ in
$\mathcal{D}_3$ may be replaced by $Disequ(\vartheta_{31} \circ \{x_1 \leftarrow a\}) \equiv (x_1, x_3) \neq (a, g^2(a))$. Hence,
we get $\mathcal{D}'_3 \equiv (z_1, z_2, z_3) = (g(x_1), x_2, x_3) \wedge (x_1, x_3) \neq (a, g^2(a))$.

Analogously to Theorem 3.1 it can be shown that the transformation of the
disjuncts in a DNF according to Lemma 4.1 is actually all we need for a negation
elimination procedure. Moreover, a non-deterministic version of this algorithm
will allow us to derive the coNP-completeness result in Theorem 4.2.

Theorem 4.1. (negation elimination from simplified formulae in DNF) 

Let $\mathcal{D} \equiv (\exists \vec{x})\,[\vec{z} = t_1 \wedge \bigwedge_{j=1}^{k_1} Disequ(\vartheta_{1j})] \vee \ldots \vee [\vec{z} = t_n \wedge \bigwedge_{j=1}^{k_n} Disequ(\vartheta_{nj})]$
be an equational formula in DNF, s.t. first each disjunct has been transformed
via Lemma 3.1 followed by Lemma 3.2 and then Lemma 4.1 has been applied
simultaneously to all disequations. Then negation elimination from $\mathcal{D}$ is possible,
iff $Var(rg(\vartheta_{ij})) = \emptyset$ for every $i \in \{1, \ldots, n\}$ and every $j \in \{1, \ldots, k_i\}$.

Proof. (Sketch): Exactly like in Theorem 3.1 the "if"-direction follows from the
correspondence between implicit generalizations and equational formulae. The
“only if” -direction is more involved. For details, see nn. □ 

Theorem 4.2. (coNP-completeness) The negation elimination problem of 
existentially quantified equational formulae in DNF is coNP- complete. 
Proof. (Sketch): As for the coNP-membership, let $\mathcal{D} \equiv (\exists \vec{x})\,\mathcal{D}_1 \vee \ldots \vee \mathcal{D}_n$ be
an equational formula in DNF. W.l.o.g. we may assume that the polynomial
time transformations from the Lemmas 3.1 and 3.2 have been applied to each
disjunct $\mathcal{D}_i$. Hence, in particular, $\mathcal{D}_i$ is of the form $[\vec{z} = t_i \wedge \bigwedge_{j=1}^{k_i} Disequ(\vartheta_{ij})]$.
Then we can check via a non-deterministic version of Lemma 4.1 that negation
elimination from $\mathcal{D}$ is impossible, namely: Guess a tuple $t_i$, a substitution $\vartheta_{ij}$
and indices $\alpha_1, \ldots, \alpha_{i-1}, \alpha_{i+1}, \ldots, \alpha_n$ and check that the resulting substitution
$\lambda \in \Lambda(i,j)$ from Lemma 4.1 is non-ground.

For the coNP-hardness proof, recall that the emptiness problem of implicit
generalizations is coNP-complete (cf., e.g., 0 , p], ^ or P 3 ), i.e.: Let $\mathcal{F}$ be
a signature and let $s_1, \ldots, s_n$ and $t$ be term tuples over $\mathcal{F}$, s.t. every $s_i$ is an
instance of $t$. Does the implicit generalization $I = t/(s_1 \vee \ldots \vee s_n)$ contain no
$\mathcal{F}$-ground instance? Now let an instance of the emptiness problem of implicit
generalizations be given through the term tuples $s_1 = (s_{11}, \ldots, s_{1k}), \ldots, s_n =
(s_{n1}, \ldots, s_{nk})$ and $t = (t_1, \ldots, t_k)$, s.t. every tuple $s_i$ is an instance of $t$. Moreover,
let $\vec{x}$ denote a vector of variables, s.t. all variables in $t$ and in any tuple $s_i$ are
contained in $\vec{x}$. Finally let $u, v, z_1, \ldots, z_{k+2}$ be fresh, pairwise distinct variables
and let $\vec{z}$ be defined as $\vec{z} = (z_1, \ldots, z_{k+2})$. Then we define the formula $\mathcal{D}$ in DNF
as $\mathcal{D} \equiv (\exists \vec{x})(\exists u, v)\,[(\vec{z} = (t_1, \ldots, t_k, u, v) \wedge u \neq v) \vee \bigvee_{i=1}^{n} \vec{z} = (s_{i1}, \ldots, s_{ik}, u, u)]$.

In it is shown that the implicit generalization $I = t/(s_1 \vee \ldots \vee s_n)$ is empty,
iff negation elimination from $\mathcal{D}$ is possible. □



5 Equational Formulae in CNF 

A straightforward negation elimination algorithm for purely existentially quan- 
tified equational formulae in CNF consists of a transformation from CNF into 
DNF followed by our algorithm from the previous section. Of course, in the worst 
case, this transformation into DNF leads to an exponential blow-up. However,
by the $\Pi_2^p$-hardness shown in Theorem 5.1 we cannot expect to do much better
than this anyway.
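
The exponential blow-up of the CNF-to-DNF transformation is easy to observe on a standard family of formulae. The following Python sketch is an illustration under our own assumptions: literals are treated as opaque atoms (in our setting they would be equations and disequations), and the function names are hypothetical.

```python
from itertools import product

def cnf_to_dnf(clauses):
    """Distribute a CNF (a list of clauses, each a list of literals) into DNF.

    Every disjunct of the result picks exactly one literal from each clause.
    """
    return {frozenset(choice) for choice in product(*clauses)}

def blow_up_example(n):
    """The CNF (x1 v y1) ^ ... ^ (xn v yn) over 2n pairwise distinct atoms."""
    return [[f"x{i}", f"y{i}"] for i in range(1, n + 1)]

# n clauses of size 2 yield 2**n disjuncts.
sizes = {n: len(cnf_to_dnf(blow_up_example(n))) for n in (1, 2, 5, 10)}
print(sizes)
```

For this family none of the $2^n$ disjuncts subsumes another, so no simplification of the naive distribution can avoid the growth.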

Theorem 5.1. ($\Pi_2^p$-hardness) The negation elimination problem of existentially
quantified equational formulae in CNF is $\Pi_2^p$-hard.

Proof. (Sketch): Recall the well-known $\Sigma_2^p$-hard problem 3QSAT2 (= quantified
satisfiability with two quantifier alternations, cf. ISl), i.e.: Given sets
$P = \{p_1, \ldots, p_k\}$ and $R = \{r_1, \ldots, r_l\}$ of propositional variables and a Boolean
formula $E = (l_{11} \wedge l_{12} \wedge l_{13}) \vee \ldots \vee (l_{n1} \wedge l_{n2} \wedge l_{n3})$ with propositional variables
in $P \cup R$, is the quantified Boolean sentence $(\exists P)(\forall R)E$ satisfiable? Now let
$a \in \mathcal{F}$ denote an arbitrary constant symbol. Then we reduce such an instance of
the 3QSAT2 problem to the complementary problem of the negation elimination
problem in the following way:

$\mathcal{D} \equiv (\exists \vec{x})\,[(d_{11} \vee d_{12} \vee d_{13} \vee z_{k+1} \neq z_{k+2}) \wedge \ldots \wedge (d_{n1} \vee d_{n2} \vee d_{n3} \vee z_{k+1} \neq z_{k+2})]$,
where $\vec{z} = (z_1, \ldots, z_{k+2})$ denotes the free variables in $\mathcal{D}$, $\vec{x}$ is of the form
$\vec{x} = (x_1, \ldots, x_l)$ and the $d_{ij}$'s are defined as follows:

$d_{ij} \equiv z_\gamma \neq a$ if $l_{ij}$ is an unnegated propositional variable $p_\gamma \in P$
$d_{ij} \equiv z_\gamma = a$ if $l_{ij}$ is a negative propositional literal $\neg p_\gamma$ for some $p_\gamma \in P$
$d_{ij} \equiv x_\gamma \neq a$ if $l_{ij}$ is an unnegated propositional variable $r_\gamma \in R$
$d_{ij} \equiv x_\gamma = a$ if $l_{ij}$ is a negative propositional literal $\neg r_\gamma$ for some $r_\gamma \in R$

It can be shown that negation elimination from $\mathcal{D}$ is impossible, iff $(\exists P)(\forall R)E$
is satisfiable (see |E]). □

Unfortunately, it is not clear whether the $\Pi_2^p$-membership also holds. The obvious
upper bound on the negation elimination problem of equational formulae in
CNF is coNEXPTIME (i.e.: transform the CNF into DNF and apply the
coNP-procedure from the previous section). An exact complexity classification in case
of CNF has to be left for future research.



6 Conclusion 

In this paper we have presented a new negation elimination algorithm for purely 
existentially quantified equational formulae. The main idea of our approach 
was an appropriate extension of the notion of “effectively ground” disequations, 
which was given in CD for the case of quantifier-free conjunctions of equations 
and disequations. In case of conjunctions of equations and disequations with 
purely existential quantifier prefix, we were thus able to decide negation elimi- 
nation in polynomial time rather than by the decision procedure from , which 
has exponential complexity. For formulae in DNF our algorithm actually has ex- 
ponential complexity. However, in general, it is still considerably more efficient 
than the algorithm from m for deciding whether the corresponding disjunction 
$I = I_1 \vee \ldots \vee I_n$ of implicit generalizations has a finite explicit representation.
This can be seen as follows: The algorithm from uni has two sources of exponen- 
tial complexity: One comes from a transformation rule which basically allows us 
to restrict the term tuples on the right-hand side of an implicit generalization 
$I_i$ to the complement of another implicit generalization $I_j$. This transformation
is very similar to our transformation in Lemma 4.1. Moreover, the algorithm
from m contains another rule, which allows us to transform a single implicit 
generalization R in exactly the same way as the algorithm from |H|. Again, our 
algorithm uses the cheap transformations from the Lemmas 3.1 and 3.2 rather
than the algorithm from 0 . Together with the transformations from | 2 ] and 
of arbitrary equational formulae into existentially quantified ones in DNF, our 
algorithm from Sect. 4 can be seen as a step towards a more efficient negation
elimination procedure for the general case. 

For existentially quantified formulae in DNF we have provided an exact complexity
classification of the negation elimination problem by proving its
coNP-completeness. In case of CNF, we have left a gap between the $\Pi_2^p$ lower bound
and the coNEXPTIME upper bound for future research.
Abstract. We consider a restricted version of the general Set Covering
problem in which each set in the given set system intersects with any
other set in at most 1 element. We show that the Set Covering problem
with intersection 1 cannot be approximated within a $o(\log n)$ factor in
random polynomial time unless $NP \subseteq ZTIME(n^{O(\log\log n)})$. We also
observe that the main challenge in derandomizing this reduction lies in 
finding a hitting set for large volume combinatorial rectangles satisfy- 
ing certain intersection properties. These properties are not satisfied by 
current methods of hitting set construction. 

An example of a Set Covering problem with the intersection 1 property is 
the problem of covering a given set of points in two or higher dimensions 
using straight lines; any two straight lines intersect in at most one point. 
The best approximation algorithm currently known for this problem has
an approximation factor of $O(\log n)$, and beating this bound seems hard.
We observe that this problem is Max-SNP-Hard. 



1 Introduction 

The general Set Covering problem requires covering a given base set B of size 
n using the fewest number of sets from a given collection of subsets of B. This 
is a classical NP-Complete problem and its instances arise in numerous diverse 
settings. Thus approximation algorithms which run in polynomial time are of 
interest. 

Johnson P2| showed that the greedy algorithm for Set Cover gives an $O(\log n)$
approximation factor. Much later, following advances in Probabilistically Checkable
Proofs Pj, Lund and Yannakakis [El and Bellare et al. |Z] showed that
there exists a positive constant $c$ such that the Set Covering problem cannot
be approximated in polynomial time within a $c \log n$ factor unless $NP \subseteq
DTIME(n^{O(\log\log n)})$. Feige jE] improved the approximation threshold to
$(1 - o(1)) \log n$, under the same assumption. Raz and Safra^ni and Arora and
Sudan jS] then obtained improved Probabilistically Checkable Proof Systems with
sub-constant error probability; their work implied that the Set Covering problem
cannot be approximated within a $c \log n$ approximation factor (for some constant
$c$) unless $NP = P$.
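
For concreteness, the greedy algorithm referred to above can be sketched as follows. This is a minimal illustration under our own naming (`greedy_set_cover` is not taken from any of the cited papers): in each round it picks a set covering the largest number of still uncovered elements.

```python
def greedy_set_cover(base, sets):
    """Return indices of the sets chosen by the greedy heuristic.

    base: iterable of elements to cover; sets: list of Python sets.
    Ties are broken in favour of the smallest index.
    """
    uncovered = set(base)
    chosen = []
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("the given sets do not cover the base set")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# A small instance on which greedy uses 3 sets although sets 1 and 2 suffice.
example_sets = [{1, 2, 4, 5}, {1, 2, 3}, {4, 5, 6}]
print(greedy_set_cover(range(1, 7), example_sets))
```

On the example the heuristic first takes the set of size 4 and then needs both remaining sets, whereas the optimum cover has size 2; the logarithmic bound quantifies how far this gap can grow.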

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 624-E33 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
Note that all the above hardness results are for general instances of the Set
Covering problem and do not hold for instances when the intersection of any pair 
of sets in the given collection is guaranteed to be at most 1. Our motivation for 
considering this restriction to intersection 1 arose from the following geometric 
instance of the Set Covering problem. 

Given a collection of points and lines in a plane, consider the problem of 
covering the points with as few lines as possible. Megiddo and Tamir |TT)| showed 
that this problem is NP-Hard. Hassin and Megiddo^J showed NP-Hardness 
even when the lines are axis-parallel but in 3D. The best approximation factor
known for this problem is $O(\log n)$. Improving this factor seems to be hard, and
this motivated our study of inapproximability for Set Covering with intersection
1. Note that any two lines intersect in at most 1 point.
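
The following Python sketch illustrates why the geometric instance has the intersection 1 property. It is only an illustration under simplifying assumptions of our own: the points are distinct and have integer coordinates, and the candidate lines are those spanned by at least two input points (in the problem above the lines are part of the input).

```python
from itertools import combinations

def collinear(p, q, r):
    """Exact collinearity test for integer points via the cross product."""
    return (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])

def lines_through_points(points):
    """Each line is represented by the set of input points lying on it."""
    lines = set()
    for p, q in combinations(points, 2):
        lines.add(frozenset(r for r in points if collinear(p, q, r)))
    return lines

def max_pairwise_intersection(sets):
    """Largest number of elements shared by two distinct sets."""
    return max((len(s & t) for s, t in combinations(sets, 2)), default=0)

points = [(0, 0), (1, 1), (2, 2), (0, 2), (2, 0), (1, 0)]
lines = lines_through_points(points)
print(len(lines), max_pairwise_intersection(lines))
```

Viewing each line as the set of points it contains turns the covering problem into a Set Covering instance, and two distinct lines can never share two points.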

The problem of covering points with lines was in turn motivated by the prob- 
lem of covering a rectilinear polygon with holes using rectangles EE|. This prob- 
lem has applications in printing integrated circuits and image compression |0|. 
This problem is known to be Max-SNP-Hard even when the rectangles are constrained
to be axis-parallel. For this case, an $O(\sqrt{\log n})$-factor approximation
algorithm was obtained recently by Anil Kumar and Ramesh|2|. However, this 
algorithm does not extend to the case when the rectangles need not be axis- 
parallel. Getting a o(log n)-factor approximation algorithm for this case seems 
to require solving the problem of covering points with arbitrary lines, though we 
are not sure of the exact nature of this relationship. 

Our Result. We show that there exists a constant $c > 0$ such that approximating
the Set Covering problem with intersection 1 to within a factor of
$c \log n$ in random polynomial time is possible only if $NP \subseteq ZTIME(n^{O(\log\log n)})$
(where $ZTIME(t)$ denotes the class of languages that have a probabilistic algorithm
running in expected time $t$ with zero error). We also give a sub-exponential
derandomization which shows that approximating the Set Covering problem with
intersection 1 to within a factor of $c \frac{\log n}{\log\log n}$ in deterministic polynomial time is
possible only if $NP \subseteq DTIME(2^{n^{1-\epsilon}})$, for any constant $\epsilon < 1/2$.

The starting point for our result above is the Lund-Yannakakis hardness 
proof^SI for the general Set Covering problem. This proof uses an auxiliary set 
system with certain properties. We show that this auxiliary set system necessarily 
leads to large intersection. We then replace this auxiliary set system by another 
carefully chosen set system with additional properties and modify the reduction 
appropriately to ensure that intersection sizes stay small. The key features of 
the new set system are partitions of the base set into several sets of smaller size 
(instead of just 2 sets as in the case of the Lund-Yannakakis system or a constant 
number of sets as in Feige’s system; small sets will lead to small intersection) 
and several such partitions (so that sets which “access” the same partition in 
the Lund-Yannakakis system and therefore have large intersection now “access” 
distinct partitions). 

We then show how the new set system above can be constructed in randomized
polynomial time and also how this randomized algorithm can be derandomized
using conditional probabilities and appropriate estimators in $O(2^{n^{1-\epsilon}})$ time,
where $\epsilon$ is a positive constant, specified later. This leads to the two conditions
above, namely, $NP \subseteq DTIME(2^{n^{1-\epsilon}})$ (but for a hardness of $O(\frac{\log n}{\log\log n})$)
and $NP \subseteq ZTIME(n^{O(\log\log n)})$. A deterministic polynomial time construction
of our new set system will lead to the quasi-NP-Hardness of approximating the
Set Covering problem with intersection 1 to within a factor of $c \log n$, for some
constant $c > 0$.

While the Lund-Yannakakis set system can be constructed in deterministic 
polynomial time using e-biased limited independence sample spaces, this does 
not seem to be true of our set system. One of the main bottlenecks in construct- 
ing our set system in deterministic polynomial time is the task of obtaining a 
polynomial size hitting set for Combinatorial Rectangles, with the hitting set sat- 
isfying additional properties. One of these properties (the most important one) 
is the following: if a hitting set point has the elements i,j among its coordinates, 
then no other hitting set point can have both i,j among its coordinates. The 
only known construction of a polynomial size hitting set for combinatorial rect- 
angles is by Linial, Luby, Saks, and Zuckerman m and is based on enumerating 
walks in a constant degree expander graph. In the full version of this paper, we 
show that the hitting set obtained by m does not satisfy the above property 
for reasons that seem intrinsic to the use of constant degree expander graphs. 

In the full version, we also note that if the proof systems for NP obtained
by Raz and Safra|T^ or Arora and Sudan|^ have an additional property then
the condition $NP \subseteq ZTIME(n^{O(\log\log n)})$ can be improved to $NP = ZPP$.
Similarly, the statement that approximating the Set Covering problem with
intersection 1 to within a factor of $c \frac{\log n}{\log\log n}$ in deterministic polynomial time is
possible only if $NP \subseteq DTIME(2^{n^{1-\epsilon}})$ can be strengthened to approximation
factor $c \log n$ instead of $c \frac{\log n}{\log\log n}$. The property needed of the proof systems is
that the degree, i.e., the total number of random choices of the verifier for which
a particular question is asked of a particular prover, be $O(n^{\delta})$, for some small
enough constant value $\delta$. The degree influences the number of partitions in our
auxiliary proof system and therefore needs to be small. It is not clear whether
existing proof systems have this property Enj-.

The above proof of hardness for Set Covering with intersection 1 does not 
apply to the problem of covering points with lines, the original problem which 
motivated this paper; however, it does indicate that algorithms based on set 
cardinalities and small pairwise intersection alone are unlikely to give a o(log n) 
approximation factor for this problem. 

Further, our result shows that constant VC-dimension alone does not help 
in getting a o(log n) approximation for the Set Covering problem. This is to be 
contrasted with the result of Bronnimann and Goodrich Q which shows that 
if the VC-dimension is a constant and an $O(1/\epsilon)$ sized (weighted) $\epsilon$-net can be
constructed in polynomial time, then a constant factor approximation can be 
obtained. 

The paper is organized as follows. Section 2 will give an overview of the
Lund-Yannakakis reduction. Section 3 shows why the Lund-Yannakakis proof
does not show hardness of Set Covering when the intersection is constrained to
be 1. Section 4 describes the reduction to Set Covering with intersection 1. This
section describes a new set system we need to obtain in order to perform the
reduction and shows hardness of approximation of its set cover, unless $NP \subseteq
ZTIME(n^{O(\log\log n)})$. Section 5 will sketch the randomized construction of this
set system. Section 6 sketches the sub-exponential time derandomization, which
leads to a slightly different hardness result, unless $NP \subseteq DTIME(2^{n^{1-\epsilon}})$, $\epsilon <
1/2$. Section 7 enumerates several interesting open problems which arise from
this paper.



2 Preliminaries: The Lund-Yannakakis Reduction 

In this section, we sketch the version of the Lund-Yannakakis reduction described 
by Arora and Lund |3|. The reduction starts with a 2-Prover 1-Round proof 
system for Max-3SAT(5) which has inverse polylogarithmic error probability,
uses $O(\log n \log\log n)$ randomness, and has $O(\log\log n)$ answer size. Here $n$ is
the size of the Max-3SAT(5) formula T . Arora and Lund|2| abstract this proof 
system into the following Label Cover problem. 

The Label Cover Problem. A bipartite graph $G$ having $n' + n'$ vertices and
edge set $E$ is given, where $n' = n^{O(\log\log n)}$. All vertices have the same degree
$deg$, which is polylogarithmic in $n$. For each edge $e \in E$, a partial function
$f_e : [d] \to [d']$ is also given, where $d > d'$, and $d, d'$ are polylogarithmic in $n$.
The aim is to assign to each vertex on the left a label in the range $1 \ldots d$, and
to each vertex on the right a label in the range $1 \ldots d'$, so as to maximize the
number of edges $e = (u,v)$ satisfying $f_e(label(u)) = label(v)$. Edge $e = (u,v)$ is
said to be satisfied by a labelling if the labelling satisfies $f_e(label(u)) = label(v)$.
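
The objective of the Label Cover problem can be made concrete with a small sketch. The encoding below is our own assumption for illustration: each partial function $f_e$ is a Python dictionary defined only on part of $[d]$, and `satisfied_edges` is a hypothetical helper name.

```python
def satisfied_edges(edges, f, label_left, label_right):
    """Return the edges (u, v) with f[(u, v)](label(u)) == label(v).

    f maps each edge to a dict representing a partial function [d] -> [d'];
    an edge on which the function is undefined for label(u) is not satisfied.
    """
    good = []
    for (u, v) in edges:
        image = f[(u, v)].get(label_left[u])
        if image is not None and image == label_right[v]:
            good.append((u, v))
    return good

# Toy instance with d = 3 and d' = 2.
edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1")]
f = {
    ("u1", "v1"): {1: 1, 2: 2},
    ("u1", "v2"): {1: 2, 3: 1},
    ("u2", "v1"): {2: 1, 3: 1},
}
right = {"v1": 1, "v2": 2}
all_good = satisfied_edges(edges, f, {"u1": 1, "u2": 2}, right)
some_good = satisfied_edges(edges, f, {"u1": 2, "u2": 2}, right)
print(len(all_good), len(some_good))
```

The first labelling satisfies all three edges, while changing the label of `u1` to 2 leaves only the edge `("u2", "v1")` satisfied.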

The 2-Prover 1-Round proof system mentioned above ensures that either all 
the edges in G are satisfied by some labelling or that no labelling satisfies more 
than a fraction of the edges, depending upon whether or not the
Max-3SAT(5) formula T is satisfiable. Next, in time polynomial in the size of $G$,
an instance SC of the Set Covering problem is obtained from this Label Cover 
problem $\mathcal{LC}$ with the following properties: if there exists a labelling satisfying
all edges in G then there is a set cover of size 2n', and if no labelling satisfies 
more than a ^ fraction of the edges then the smallest set cover has size 
$\Omega(2n' \log n')$. The base set in SC will have size polynomial in $n'$. It follows that
the Set Covering problem cannot be approximated to a logarithmic factor of the
base set size unless $NP \subseteq DTIME(n^{O(\log\log n)})$.

Improving this condition to $NP = P$ requires using a stronger multi-prover
proof system j I Dpt)] which has a constant number of provers (more than 2), 
$O(\log n)$ randomness, $O(\log\log n)$ answer sizes, and inverse polylogarithmic error
probability. The reduction from such a proof system to the Set Covering problem 
is similar to the reduction from the Label Cover to the Set Covering problem 
mentioned above, with a modification needed to handle more than 2 provers 
(this modification is described in |Z|). 
In this abstract, we will only describe the reduction from Label Cover to
the Set Covering problem and show how we can modify this reduction to hold
for the case of intersection 1. This will show that the Set Covering problem with
intersection 1 cannot be approximated to a logarithmic factor unless $NP \subseteq
ZTIME(n^{O(\log\log n)})$. The multi-prover proof system of the previous paragraph
with an additional condition can strengthen the latter condition to $NP = ZPP$;
this is described in the full version.

We now briefly sketch the reduction from an instance $\mathcal{LC}$ of Label Cover to
an instance SC of the Set Covering problem.

2.1 Label Cover to Set Cover 

The following auxiliary set system given by a base set N = {1 . . . n'} and its 
partitions is needed. 

The Auxiliary System of Partitions. Consider $d'$ distinct partitions of $N$
into two sets each, with the partitions satisfying the following property: if at
most 1H£IL 30|;g in all are chosen from the various partitions with no two sets 
coming from the same partition, then the union of these sets does not cover 
N . Partitions with the above properties can be constructed deterministically in 
polynomial time m-. Let $P_i^1$, $P_i^2$ respectively denote the first and second sets
in the $i$th partition. We describe the construction of SC next.

Using the $P_i^j$s to construct SC. The base set $B$ for SC is defined to be
$\{(e,i) \mid e \in E, 1 \leq i \leq n'\}$. The collection $C$ of subsets of $B$ contains a set
$C(v,a)$, for each vertex $v$ and each possible label $a$ with which $v$ can be labelled.
If $v$ is a vertex on the left, then for each $a$, $1 \leq a \leq d$, $C(v,a)$ is defined as
$\{(e,i) \mid e$ incident on $v \wedge i \in P^1_{f_e(a)}\}$. And if $v$ is a vertex on the right, then for
each $a$, $1 \leq a \leq d'$, $C(v,a)$ is defined as $\{(e,i) \mid e$ incident on $v \wedge i \in P^2_a\}$.

That SC satisfies the required conditions can be seen from the following facts. 

1. If there exists a vertex labelling which satisfies all the edges, then B can be 
covered by just the sets C{v, a) where a is the label given to v. Thus the size 
of the optimum cover is 2n' in this case. 

2. If the total number of sets in the optimum set cover is at most some suitable 

constant times $n' \log n'$, then at least a constant fraction of the edges $e =
(u, v)$ have the property that the number of sets of the form $C(u, *)$ plus the
number of sets of the form $C(v, *)$ in the optimum set cover is at most .
Then, for each such edge $e$, there must exist a label $a$ such that $C(u, a)$ and
$C(v, f_e(a))$ are both in this optimum cover. It can be easily seen that choosing
a label uniformly at random from these sets for each vertex implies that there 
exists a labelling of the vertices which satisfies an „ fraction 

of the edges. 
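
Fact 1 can be checked on a toy instance. The sketch below is our own illustration of the construction of SC; the names `build_set_cover`, `partitions` and the toy data are assumptions. In particular, the two toy partitions are not claimed to satisfy the covering property required of the auxiliary system, and left and right vertices are assumed to carry distinct names.

```python
def build_set_cover(edges, f, left_labels, right_labels, partitions):
    """Build the sets C(v, a) from a Label Cover instance.

    partitions[b] = (first, second) is a partition of the base set N into
    two parts, indexed by a right-hand label b. f[e] is a dict encoding the
    partial function of edge e.
    """
    sets = {}
    for u in {e[0] for e in edges}:
        for a in left_labels:
            sets[(u, a)] = {(e, i) for e in edges if e[0] == u and a in f[e]
                            for i in partitions[f[e][a]][0]}
    for v in {e[1] for e in edges}:
        for b in right_labels:
            sets[(v, b)] = {(e, i) for e in edges if e[1] == v
                            for i in partitions[b][1]}
    return sets

N = {1, 2, 3, 4}
partitions = {1: ({1, 2}, {3, 4}), 2: ({1, 3}, {2, 4})}
edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1")]
f = {
    ("u1", "v1"): {1: 1, 2: 2},
    ("u1", "v2"): {1: 2, 3: 1},
    ("u2", "v1"): {2: 1, 3: 1},
}
sets = build_set_cover(edges, f, [1, 2, 3], [1, 2], partitions)

# A labelling satisfying every edge yields a cover with one set per vertex.
labelling = {"u1": 1, "u2": 2, "v1": 1, "v2": 2}
base = {(e, i) for e in edges for i in N}
cover = set().union(*(sets[(v, a)] for v, a in labelling.items()))
print(cover == base, len(labelling))
```

For every satisfied edge $e = (u,v)$ the set chosen for $u$ contributes the first part and the set chosen for $v$ the second part of the same partition, so together they contain all tuples $(e, i)$.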
3 SC Has Large Intersection

There are two reasons why sets in the collection C in SC have large intersections. 

Parts in the Partitions are Large. The first and obvious reason is that
the sets in each partition in the auxiliary system of partitions are large and
could have size $\Omega(n')$; therefore, two sets in distinct partitions could have $\Omega(n')$
intersection. This could lead to sets $C(v,a)$ and $C(v,b)$ having $\Omega(n')$ common
elements of the form $(e,i)$, for some $e$ incident on $v$.

Clearly, the solution to this problem is to work with an auxiliary system 
of partitions where each partition is a partition into not just 2 large sets, but 
into several small sets. The problem remains if we form only a constant number 
of parts, as in m-. We choose to partition into $(n')^{1-\epsilon}$ sets, where $\epsilon$ is some
non-zero constant to be fixed later. This ensures that each set in each partition
has size $O((n')^{\epsilon}\,\mathrm{polylog}(n))$ and that any two such sets have $O(1)$ intersection.
However, smaller set size leads to other problems which we shall describe shortly. 

Functions $f_e()$ are not 1-1. Suppose we work with smaller set sizes as
above. Then consider the sets $C(v,a)$ and $C(v,b)$, where $v$ is a vertex on the left
and $a, b$ are labels with the following property: for some edge $e$ incident on $v$,
$f_e(a) = f_e(b)$. Then each element $(e, *)$ which appears in $C(v,a)$ will also appear
in $C(v,b)$, leading to an intersection size of up to $\Omega((n')^{\epsilon} \cdot deg)$, where $deg$ is the
degree of $v$ in $G$. This is a more serious problem. Our solution to this problem is
to ensure that sets $C(v,a)$ and $C(v,b)$ are constructed using distinct partitions
in the auxiliary system of partitions.

Next, we describe how to modify the auxiliary system of partitions and the 
construction of SC in accordance with the above. 

4 $\mathcal{LC}$ to SC with Intersection 1

Our new auxiliary system of partitions $\mathcal{P}$ will have $d' \cdot (deg + 1) \cdot d$ partitions,
where $deg$ is the degree of any vertex in $G$. Each partition has $m = (n')^{1-\epsilon}$
parts, for some $\epsilon > 0$ to be determined. These partitions are organized into $d'$
groups, each containing $(deg + 1) \cdot d$ partitions. Each group is further organized
into $deg + 1$ subgroups, each containing $d$ partitions. The first $m/2$ sets in each
partition comprise its left half and the last $m/2$ its right half.

Let $P_{g,s,p}$ denote the $p$th partition in the $s$th subgroup of the $g$th group and
let $P_{g,s,p,k}$ denote the $k$th set (i.e., part) in this partition. Let $B_k$ denote the set
$\bigcup_{g,s,p} P_{g,s,p,k}$ if $1 \leq k \leq m/2$, and the set $\bigcup_{g,s} P_{g,s,1,k}$ if $m/2 < k \leq m$. We also
refer to $B_k$ as the $k$th column of $\mathcal{P}$.

We need the following properties to be satisfied by the system of partitions 
V. 

1. The right sides of all partitions within a subgroup are identical, i.e., $P_{g,s,p,k} =
P_{g,s,1,k}$, for every $k > m/2$.

2. $P_{g,s,p,k} \cap P_{g',s',p',k} = \emptyset$ unless either $g = g'$, $s = s'$, $p = p'$, or,
$k > m/2$ and $g = g'$, $s = s'$. In other words, no element appears twice
within a column, modulo the fact that the right sides of partitions within a
subgroup are identical.

3. $|B_k \cap B_{k'}| \leq 1$ for all $k, k'$, $1 \leq k, k' \leq m$, $k \neq k'$.

4. Suppose $N$ is covered using at most $\beta m \log n'$ sets in all, disallowing sets on
the right sides of those partitions which are not the first in their respective 
subgroups. Then there must be a partition in some subgroup s such that the 
number of sets chosen from the left side of this partition plus the number of 
sets chosen from right side of the first partition in s together sum to at least 
|m. 

$\epsilon$ and $\beta$ are constants which will be fixed later. Let $A_{p,k} = \bigcup_{g,s} P_{g,s,p,k}$,
for each $p, k$, $1 \leq p \leq d$, $1 \leq k \leq m/2$. Let $D_{g,k} = \bigcup_{s} P_{g,s,1,k}$, for each $g, k$,
$1 \leq g \leq d'$, $m/2 + 1 \leq k \leq m$. Property 2 above implies that:

5. $|A_{p,k} \cap A_{p',k}| = 0$ for all $p \neq p'$, where $1 \leq p, p' \leq d$ and $k \leq m/2$.

6. $|D_{g,k} \cap D_{g',k}| = 0$ for all $g \neq g'$, where $1 \leq g, g' \leq d'$ and $k > m/2$.

We will describe how to obtain a system of partitions $\mathcal{P}$ satisfying these
properties in Section 5 and Section 6. First, we show how a set system SC with
intersection 1 can be constructed using $\mathcal{P}$.

4.1 Using $\mathcal{P}$ to Construct SC

The base set $B$ for SC is defined to be $\{(e,i) \mid e \in E, 1 \leq i \leq n'\}$ as before. This
set has size $(n')^2 \cdot deg = O((n')^2\,\mathrm{polylog}(n))$.

The collection $C$ of subsets of $B$ contains $m/2$ sets $C_1(v,a), \ldots, C_{m/2}(v,a)$,
for each vertex $v$ on the left (in graph $G$) and each possible label $a$ with which $v$
can be labelled. In addition, it contains $m/2$ sets $C_{m/2+1}(v,a), \ldots, C_m(v,a)$, for
each vertex $v$ on the right in $G$ and each possible label $a$ with which $v$ can be
labelled. These sets are defined as follows.

Let $E_v$ denote the set of edges incident on $v$ in $G$. We edge-colour $G$ using
$deg + 1$ colours. Let $col(e)$ be the colour given to edge $e$ in this edge colouring.
For a vertex $v$ on the left side, and any number $k$ between 1 and $m/2$, $C_k(v,a) =
\bigcup_{e \in E_v} \{(e,i) \mid i \in P_{f_e(a),col(e),a,k}\}$. For a vertex $v$ on the right side, and any number
$k$ between $m/2 + 1$ and $m$, $C_k(v,a) = \bigcup_{e \in E_v} \{(e,i) \mid i \in P_{a,col(e),1,k}\}$.

We now give the following lemmas which state that the set system SC has 
intersection 1 and that it has a set cover of small size if and only if there exists 
a way to label the vertices of G satisfying several edges simultaneously. The 
hardness of approximation of the set cover of SC is given in Corollary d whose 
proof will appear in the full version. 

Lemma 1. The intersection of any two distinct sets $C_k(v,a)$ and $C_{k'}(w,b)$ has size
at most 1.

Proof. Note that for $|C_k(v,a) \cap C_{k'}(w,b)|$ to exceed 1, either $v, w$ must be identical
or there must be an edge between $v$ and $w$. The reason for this is that each
element in $C_k(v,a)$ has the form $(e, *)$ where $e$ is an edge incident at $v$ while
each element in $C_{k'}(w,b)$ has the form $(e', *)$, where $e'$ is an edge incident at $w$.
We consider each case in turn.

Case 1. Suppose $v = w$. Then either $k \neq k'$ or $k = k'$, $a \neq b$.

First, consider $C_k(v,a)$ and $C_{k'}(v,b)$ where $k \neq k'$ and $v$ is a vertex on the left
side. If $a = b$, observe that $C_k(v,a) \cap C_{k'}(v,a) = \emptyset$. So assume that $a \neq b$. The
elements in the former set are of the form $(e,i)$ where $i \in P_{f_e(a),col(e),a,k}$ and the
elements of the latter set are of the form $(e,j)$ where $j \in P_{f_e(b),col(e),b,k'}$. Note
that $\bigcup_{e \in E_v} P_{f_e(a),col(e),a,k} \subseteq B_k$ and $\bigcup_{e \in E_v} P_{f_e(b),col(e),b,k'} \subseteq B_{k'}$. By Property 3
of $\mathcal{P}$, the intersection of $B_k$ and $B_{k'}$ has size at most 1. However, this alone does not imply
that $C_k(v,a)$ and $C_{k'}(v,b)$ have intersection at most 1, because there could
be several tuples in both sets, all having identical second entries. This could
happen if there are edges $e_1, e_2$ incident on $v$ such that $f_{e_1}(a) = f_{e_2}(a)$, $f_{e_1}(b) =
f_{e_2}(b)$ and there had been no colouring on edges. Property 2 and the fact that
$col(e_1) \neq col(e_2)$ for any two edges $e_1, e_2$ incident on $v$ rule out this possibility,
thus implying that $|C_k(v,a) \cap C_{k'}(v,b)| \leq 1$. The proof for the case where $v$ is a
vertex on the right is identical.

Second, consider $C_k(v,a)$ and $C_k(v,b)$, where $v$ is a vertex on the left and
$a \neq b$. Elements in the former set are of the form $(e,i)$ where $e$ is an edge
incident on $v$ and $i \in P_{f_e(a),col(e),a,k}$. Similarly, elements in the latter set are of
the form $(e,j)$ where $j \in P_{f_e(b),col(e),b,k}$. Note that $\bigcup_{e \in E_v} P_{f_e(a),col(e),a,k} \subseteq A_{a,k}$
and $\bigcup_{e \in E_v} P_{f_e(b),col(e),b,k} \subseteq A_{b,k}$. The claim follows from Property 5 in this case.

Third, consider $C_k(v,a)$ and $C_k(v,b)$, where $v$ is a vertex on the right, $a \neq b$,
and $k > m/2$. Elements in the former set are of the form $(e,i)$ where $e$ is an
edge incident on $v$ and $i \in P_{a,col(e),1,k}$. Similarly, elements in the latter set are
of the form $(e,j)$ where $j \in P_{b,col(e),1,k}$. Note that $\bigcup_{e \in E_v} P_{a,col(e),1,k} \subseteq D_{a,k}$ and
$\bigcup_{e \in E_v} P_{b,col(e),1,k} \subseteq D_{b,k}$. The claim follows from Property 6 in this case.

Case 2. Finally, consider sets $C_k(v,a)$ and $C_{k'}(w,b)$ where $e = (v,w)$ is an
edge, $v$ is on the left side, and $w$ on the right. Then $C_k(v,a)$ contains elements
of the form $(e',i)$ where $i \in P_{f_{e'}(a),col(e'),a,k}$. $C_{k'}(w,b)$ contains elements of the
form $(e',j)$ where $j \in P_{b,col(e'),1,k'}$. The only possible elements in $C_k(v,a) \cap
C_{k'}(w,b)$ are tuples with the first entry equal to $e$. Since $P_{f_e(a),col(e),a,k} \subseteq B_k$
and $P_{b,col(e),1,k'} \subseteq B_{k'}$ and $k \leq m/2$, $k' > m/2$, the claim follows from Properties
2 and 3 in this case.

Lemma 2. If there exists a way of labelling vertices of $G$ satisfying all its edges
then there exists a collection of $n'm$ sets in $\mathcal{C}$ which covers $B$.

Proof. Let $label(v)$ denote the label given to vertex $v$ by the above labelling.
Consider the collection $C' \subseteq \mathcal{C}$ comprising sets $C_1(v,label(v)), \ldots, C_{m/2}(v,label(v))$
for each vertex $v$ on the left and sets $C_{m/2+1}(w,label(w)), \ldots, C_m(w,label(w))$ for
each vertex $w$ on the right. We show that these sets cover $B$. Since there are
$m/2$ sets in $C'$ per vertex, $|C'| = 2n' \cdot \frac{m}{2} = n'm$.

Consider any edge $e = (v,w)$. It suffices to show that for every $i$, $1 \leq
i \leq n'$, the tuple $(e,i)$ in $B$ is contained either in one of $C_1(v,label(v)), \ldots,
C_{m/2}(v,label(v))$ or in one of $C_{m/2+1}(w,label(w)), \ldots, C_m(w,label(w))$. The key
property we use is that $f_e(label(v)) = label(w)$.

Consider the partitions $P_{f_e(label(v)),col(e),label(v)}$ and $P_{label(w),col(e),1}$. Since
$f_e(label(v)) = label(w)$, the two partitions belong to the same group and subgroup.
Since all partitions in a subgroup have the same right hand side, the
element $i$ must be present either in one of the sets $P_{label(w),col(e),label(v),k}$, where
$k \leq m/2$, or in one of the sets $P_{label(w),col(e),1,k}$, where $k > m/2$. We consider
each case in turn.

First, suppose $i \in P_{label(w),col(e),label(v),k}$ for some $k \leq m/2$. Then, from
the definition of $C_k(v,label(v))$, $(e,i) \in C_k(v,label(v))$. Second, suppose $i \in
P_{label(w),col(e),1,k}$ for some $k > m/2$. Then, from the definition of $C_k(w,label(w))$,
$(e,i) \in C_k(w,label(w))$. The lemma follows.

Lemma 3. If the smallest collection $C'$ of sets in $\mathcal{C}$ covering the base set $B$ has
size at most $\frac{\beta}{2} n'm \log n'$ then there exists a labelling of $G$ which satisfies at least
a $\frac{1}{32\beta^2 \log^2 n'}$ fraction of the edges. Recall that $\beta$ was defined in Property 4 of $\mathcal{P}$.

Proof. Given $C'$, we need to demonstrate a labelling of $G$ with the above property.
For each vertex $v$, define $L(v)$ to be the collection of labels $a$ such that
$C_k(v,a) \in C'$ for some $k$. We think of $L(v)$ as the set of "suggested labels" for $v$
given by $C'$ and this will be a multiset in general. The labelling we obtain will
ultimately choose a label for $v$ from this set. It remains to show that there is
a way of assigning each vertex $v$ a label from $L(v)$ so as to satisfy sufficiently
many edges.

We need some definitions. For an edge $e = (v,w)$, define $\#(e) = |L(v)| +
|L(w)|$. Since the sum of the sizes of all $L(v)$s put together is at most $\frac{\beta}{2} n'm \log n'$
and since all vertices in $G$ have identical degrees, the average value of $\#(e)$ is
at most $\frac{\beta}{2} m \log n'$. Thus half the edges $e$ have $\#(e) \leq \beta m \log n'$. We call these
edges good.

We show how to determine a subset $L'(v)$ of $L(v)$ for each vertex $v$ so that
the following properties are satisfied. If $v$ has a good edge incident on it then
$L'(v)$ has size at most $4\beta \log n'$. Further, for each good edge $e = (v,w)$, there
exists a label in $L'(v)$ and one in $L'(w)$ which together satisfy $e$. Clearly, random
independent choices of labels from $L'(v)$ will satisfy a good edge with probability
at least $\frac{1}{16\beta^2 \log^2 n'}$, implying a labelling which satisfies at least a $\frac{1}{32\beta^2 \log^2 n'}$ fraction
of the edges (since the total number of edges is at most twice the number of
good edges), as required.
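
The counting behind these bounds can be written out as a short chain. This is only a sketch: it assumes that the collection has size at most $\frac{\beta}{2} n'm \log n'$, that $G$ is bipartite and regular with $n'$ vertices on each side (so $|E| = n' \cdot \deg$), and that each label in $L'(v)$ is chosen uniformly at random.

```latex
% Assumptions of this sketch: |C'| <= (beta/2) n' m log n', G regular
% bipartite with n' vertices per side, labels drawn uniformly from L'(v).
\begin{align*}
\frac{1}{|E|}\sum_{e=(v,w)} \#(e)
  &= \frac{\deg}{|E|}\sum_{v} |L(v)|
   \le \frac{1}{n'}\cdot\frac{\beta}{2}\, n' m \log n'
   = \frac{\beta}{2}\, m \log n', \\
|L'(v)| &\le \frac{\beta m \log n'}{m/4} = 4\beta \log n'
   \quad\text{(for $v$ on a good edge)}, \\
\Pr[\text{a good edge is satisfied}] &\ge \frac{1}{(4\beta \log n')^{2}}
   = \frac{1}{16\beta^{2}\log^{2} n'} .
\end{align*}
```

By Markov's inequality at least half of the edges are good, which gives the expected fraction $\frac{1}{2} \cdot \frac{1}{16\beta^2 \log^2 n'} = \frac{1}{32\beta^2 \log^2 n'}$ of satisfied edges.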

For each label $a \in L(v)$, include it in $L'(v)$ if and only if the number of sets
of the form $C_k(v,a)$ in $C'$ is at least $m/4$. Clearly, $|L'(v)| \leq \frac{\beta m \log n'}{m/4} = 4\beta \log n'$
for vertices $v$ on which good edges are incident. It remains to show that for
each good edge $e = (v,w)$, there exists a label in $L'(v)$ and one in $L'(w)$ which
together satisfy $e$.

Consider a good edge $e = (v,w)$. Using Property 4 of $\mathcal{P}$, it follows that there
exists a label $a \in L(v)$ and a label $b \in L(w)$ such that $f_e(a) = b$ and the
number of sets of the form $C_k(v,a)$ or $C_k(w,b)$ in $C'$ is at least $3m/4$. The latter
implies that the number of sets of the form $C_k(v,a)$ in $C'$ must be at least $m/4$,
and likewise for $C_k(w,b)$. Thus $a \in L'(v)$ and $b \in L'(w)$. Since $f_e(a) = b$, the
claim follows. $\Box$

Corollary 1. Set Cover with intersection 1 cannot be approximated within a
factor of in random polynomial time, for some constant /3, 0 < /3 < g, 

unless NP C ZT I M . Further, if the auxiliary system of partitions 

V can be constructed in deterministic polynomial (in n' ) time, then approximat- 
ing to within a factor is possible only if NP = 

5 Randomized Construction of the Auxiliary System $\mathcal{P}$

The obvious randomized construction is the following. Ignore the division into
groups and just view $\mathcal{P}$ as a collection of subgroups. For each partition which is
the first in its subgroup, throw each element $i$ independently and uniformly at
random into one of the $m$ sets in that partition. For each partition $P$ which is
not the first in its subgroup, throw each element $i$ which is not present in any
of the sets on the right side of the first partition $Q$ in this subgroup, into one
of the first $m/2$ sets in $P$. Property 1 is thus satisfied directly. We need to show
that Properties 2, 3, 4 are together satisfied with non-zero probability.
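
The random experiment for a single subgroup can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' construction: the function name `random_subgroup`, the 0-based indexing, the seeded generator, and the reading of "right side" as the sets with indices `m/2, ..., m-1` (copied unchanged into every later partition) are all choices made here. The sketch does not attempt to enforce Properties 2, 3 or 4.

```python
import random


def random_subgroup(n_elems, m, d, seed=0):
    """Return d partitions of range(n_elems), each given as a list of m sets.

    Partition 0 plays the role of the first partition Q of the subgroup:
    every element lands in a uniformly random set.  Each later partition
    reuses the right-hand sets of Q (indices m//2 .. m-1, an assumption of
    this sketch) and throws the remaining elements uniformly into its own
    first m//2 sets.
    """
    assert m >= 2 and m % 2 == 0 and d >= 1
    rng = random.Random(seed)
    half = m // 2

    first = [set() for _ in range(m)]
    for i in range(n_elems):
        first[rng.randrange(m)].add(i)

    # Elements already placed on the right side of the first partition.
    right_elems = set().union(*first[half:])

    subgroup = [first]
    for _ in range(d - 1):
        part = [set() for _ in range(half)] + [set(s) for s in first[half:]]
        for i in range(n_elems):
            if i not in right_elems:
                part[rng.randrange(half)].add(i)
        subgroup.append(part)
    return subgroup


subgroup = random_subgroup(n_elems=20, m=4, d=3, seed=1)
print(len(subgroup), [len(p) for p in subgroup])
```

Every partition produced this way covers each element exactly once, and all partitions of the subgroup share the same right-hand sets.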

It can be shown quite easily that Property 4 holds with probability at least 
1 — (i)" ^ provided e > 22/3. Slightly weak versions of Properties 2 and 3 
(intersection bounds of 2 instead of 1) also follow immediately. This can be im- 
proved in the case of intersection 1 using the Lovasz Local Lemma, but this does 
not give a constant success probability and also leads to problems in derandom- 
ization. The details of these calculations appear in the full version. 

To obtain a high probability of success, we need to change the randomized 
construction above to respect the following additional restriction (we call this 
Property 7): each set Pg^s,p,k has size at most *{deg+i)*dn ^ g,s,p,k, 

^^g^d',l<s< deg 1,1 < p < d,l < k < m. 

The new randomized construction proceeds as in the previous random ex- 
periment, fixing partitions in the same order as before, except that any choice of 
throwing an element i G N which violates Properties 2,3,7 is precluded. Prop- 
erty 7 enables us to show that not too many choices are precluded for each 
element, and therefore, this experiment stays close in behaviour to the previous 
one (provided 22/3 < e < 1/2), except that Properties 2,3,7 are all automatically 
satisfied. The details appear in the full version. 

6 Derandomization in 0 ( 2 '^^ ') Time 

The main hurdle in derandomizing the above randomized construction in poly- 
nomial time is Property 4. There could be up to ' ) 

ways of choosing /3m log n' sets from the various partitions in P for a constant 
e' slightly smaller than e, and we need that each of these choices fails to cover 
N for Property 4 to be satisfied. 

For the Lund-Yannakakis system of partitions described in Section t2. 1 1 each 
partition was into 2 sets and the corresponding property could be obtained
deterministically using small-bias $\log n$-wise independent sample space constructions.
This is no longer true in our case. Feige's [?] system of partitions, where each
partition is into several but still a constant number of parts, can be obtained
deterministically using anti-universal sets [?]. However, it is not clear how to
apply either Feige's modified proof system or his system of partitions to get
intersection 1.

We show in the full version that enforcing Property 4 in polynomial time cor- 
responds to constructing hitting combinatorial rectangles with certain restricted 
kinds of sets, though we do not know any efficient constructions for them. In 
this paper, we take the slower approach of using Conditional Probabilities and 
enforcing Property 4 by checking each of the above choices explicitly. However, 
note that the number of choices is superexponential in n (even though it is sub- 
exponential in n') . To obtain a derandomization which is sub-exponential in n, 
we make the following change in $\mathcal{P}$: the base set is taken to be of size $n$ instead
of $n'$. We use an appropriate pessimistic estimator and conditional probabilities
to construct $\mathcal{P}$ with parameter $n$ instead of $n'$ (details will be given in the full
version). This will give a gap of 6>(logn) (instead of 0(logn')) in the set cover 
instance SC). But since the base set size in SC is now 0((n' * n) polylog(n)), 
we get a hardness of only 0(logn) = 6*( iogiog„/ ) (note that the approximation 

factor must be with respect to the base set size) unless NP C DTIMEiT^'" '), 
for any constant e such that 22(3 < e < 1/2. 

7 Open Problems 

A significant contribution of this paper is that it leads to several open problems. 

1 . Is there a polynomial time algorithm for constructing the partition system 
in SectionEP In the full version, we show its relation to the question of construc- 
tion of hitting sets for combinatorial rectangles with certain constraints. Can 
a hitting set for large volume combinatorial rectangles, with the property that 
any two hitting set points agree in at most one coordinate, be constructed in 
polynomial time? Alternatively, can a different proof system be obtained, as in 
|iulj , which will require a set system with weaker hitting properties? 

2. Are there instances of the problem of covering points by lines, with an
integrality gap of $\Theta(\log n)$? In the full version, we show an integrality
gap of 2 and we describe a promising construction, which might have a larger
gap.

3. Are there such explicit constructions for the Set Covering problem
with intersection 1? Randomized constructions are easy for this but we do not
know how to do an explicit construction.

4. Is there a polynomial time algorithm for the problem of covering points
with lines which has an $o(\log n)$ approximation factor, or can super-constant
hardness (or even a hardness of factor 2) be proved? In the final version, we
observe that the NP-Hardness proof of Megiddo and Tamir [?] can be easily
extended to a Max-SNP-Hardness proof.

Abstract. This paper studies the approximability of the sparse $k$-spanner
problem. An $O(\log n)$-ratio approximation algorithm is known for
the problem for $k = 2$. For larger values of $k$, the problem admits only
a weaker $O(n^{1/k})$-approximation ratio algorithm [?]. On the negative
side, it is known that the $k$-spanner problem is weakly inapproximable,
namely, it is NP-hard to approximate the problem with ratio $O(\log n)$,
for every $k \geq 2$ [?]. This lower bound is tight for $k = 2$ but leaves a
considerable gap for small constants $k > 2$.

This paper considerably narrows the gap by presenting a strong (or Class
III [?]) inapproximability result for the problem for any constant $k > 2$,
namely, showing that the problem is inapproximable within a ratio of
$O(2^{\log^{\varepsilon} n})$, for any fixed $0 < \varepsilon < 1$, unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$.

Hence the $k$-spanner problem exhibits a "jump" in its inapproximability
once the required stretch is increased from $k = 2$ to $k = 2 + \delta$.

This hardness result extends into a result of $O(2^{\log^{\varepsilon} n})$-inapproximability
for the $k$-spanner problem for $k = \log^{\mu} n$ and $0 < \varepsilon < 1 - \mu$, for any $0 <
\mu < 1$. This result is tight, in view of the $O(2^{\log^{1-\mu} n})$-approximation ratio
for the problem, implied by the algorithm of [?] for the case $k = \log^{\mu} n$.
To the best of our knowledge, this is the first example for a set of Class
III problems for which the upper and lower bounds "converge" in this
sense.

Our main result implies also the same hardness for some other variants
of the problem whose strong inapproximability was not known before,
such as the uniform $k$-spanner problem, the unit-weight $k$-spanner problem,
the 3-spanner augmentation problem and the "all-server" $k$-spanner
problem for any constant $k$.



1 Introduction 



The Sparse Spanner Problem 

Graph spanners have been intensively studied in the contexts of communication
networks, distributed computing, robotics and computational geometry.
Consider an unweighted $n$-vertex graph $G = (V,E)$. The distance
between two nodes $u$ and $v$ in $G$, denoted $dist(u,v,G)$, is the minimum
length of a path connecting $u$ and $v$ in $G$. A subgraph $G' = (V,E')$ of $G$ is a
$k$-spanner if $dist(u,v,G') \leq k \cdot dist(u,v,G)$ for every $u,v \in V$. We refer to $k$ as
the stretch factor of $G'$. The sparsest $k$-spanner problem is to find a $k$-spanner
$G' = (V,E')$ with the smallest edge set $E'$.
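
The definition can be checked directly with breadth-first search. The following is a minimal sketch; the function names, the edge-list representation and the 0-based vertex numbering are choices made for this illustration only.

```python
from collections import deque


def adjacency(n, edges):
    """Adjacency lists of an undirected graph on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source; unreachable vertices get None."""
    dist = [None] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_k_spanner(n, edges, sub_edges, k):
    """True iff dist(u,v,G') <= k * dist(u,v,G) for all pairs connected in G."""
    g, h = adjacency(n, edges), adjacency(n, sub_edges)
    for u in range(n):
        dist_g, dist_h = bfs_distances(g, u), bfs_distances(h, u)
        for v in range(n):
            if dist_g[v] is None:
                continue  # u and v are not connected in G
            if dist_h[v] is None or dist_h[v] > k * dist_g[v]:
                return False
    return True


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
path = [(0, 1), (1, 2), (2, 3)]
print(is_k_spanner(4, cycle, path, 3), is_k_spanner(4, cycle, path, 2))
```

Dropping one edge of the 4-cycle leaves a path in which the endpoints of the removed edge are at distance 3, so the path is a 3-spanner of the cycle but not a 2-spanner.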

The sparsest $k$-spanner problem is known to be NP-hard [?]. On the positive
side, it is also shown in [?] that for every integer $k \geq 1$, every unweighted
$n$-vertex graph $G$ has a polynomial time constructible $O(k)$-spanner with at
most $O(n^{1+1/k})$ edges. Hence in particular, every graph $G$ has an $O(\log n)$-spanner
with $O(n)$ edges. These results are close to the best possible in general,
as implied by the lower bound given in [?].

The algorithm of [?] provides us with a global upper bound for sparse
$k$-spanners, which holds for every graph. However, for specific graphs, considerably
sparser spanners may exist. Furthermore, the upper bounds on sparsity given
by these algorithms are small (i.e., close to $n$) only for large values of $k$. It is
therefore interesting to look for approximation algorithms that yield a
near-optimal $k$-spanner for any given graph.

In [?], a $\log \frac{|E|}{|V|}$-approximation algorithm was presented for the unweighted
2-spanner problem. Also, since any $k$-spanner for an $n$-vertex graph requires at
least $n - 1$ edges, the results of [?] cited above can be interpreted as providing
an $O(n^{1/k})$-ratio approximation algorithm for the unweighted $k$-spanner
problem. This implies that once the required stretch guarantee is relaxed, i.e.,
$k$ is allowed to be large, the problem becomes easier to approximate. In particular,
at the end of the spectrum, the unweighted $k$-spanner problem admits
$O(1)$ approximation once the stretch requirement becomes $k = \Omega(\log n)$. Another
particularly interesting intermediate point along this spectrum of $k$ values
is $k = O(\log^{\mu} n)$, $0 < \mu < 1$, where the above result implies $O(2^{\log^{1-\mu} n})$
approximation. We call this property ratio degradation.
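
The two points of the spectrum quoted above follow from a one-line calculation. The sketch below assumes that the approximation ratio has the form $n^{1/k}$ and that logarithms are taken to base 2.

```latex
\begin{align*}
k = \log^{\mu} n:&\quad n^{1/k} = 2^{\log n / \log^{\mu} n} = 2^{\log^{1-\mu} n}, \\
k = \log n:&\quad n^{1/k} = 2^{\log n / \log n} = 2 = O(1).
\end{align*}
```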

On the other hand, determining the hardness of approximating the $k$-spanner
problem was an open problem for quite a while. Recently, it was shown in [?]
that it is NP-hard to approximate the problem by an $O(\log n)$ ratio for $k \geq 2$.
This type of $\Omega(\log n)$-inapproximability is henceforth referred to as weak
inapproximability. Hence the issue of approximability was practically resolved for
$k = 2$, but a considerable gap was left for small constants $k > 2$.

On the way to resolving it, a number of different versions of the problem,
harder than the original one, were formulated and proved to be hard for
approximation within certain factors. Specifically, [?] introduced a certain
generalization of the basic $k$-spanner problem called the unit-length $k$-spanner problem,
and has shown that with constant stretch requirement $k > 5$, this generalized
problem is hard to approximate with $O(2^{\log^{\varepsilon} n})$ ratio. This type of
$\Omega(2^{\log^{\varepsilon} n})$-inapproximability is henceforth referred to as strong inapproximability. (In [?],
problems admitting strong inapproximability results of this type are called Class
III problems.) The latter result of [?] was later extended into hardness results
for a number of other generalizations of the basic $k$-spanner problem [?].
However, the methods of [?] fell short of establishing the strong inapproximability
of the original (unweighted) $k$-spanner problem, hence the exact approximability
level of the original problem remained unresolved.



Main Results 

In this paper we significantly narrow the gap between the known upper and
lower bounds on the approximability of the unweighted $k$-spanner problem. This
is achieved by establishing a strong inapproximability result for the basic
$k$-spanner problem for any constant stretch requirement $k > 2$. Hence the
$k$-spanner problem exhibits a "jump" in its inapproximability once the required
stretch is increased from $k = 2$ to $k = 2 + \delta$. Also our result establishes the strong
hardness of the basic $k$-spanner problem even when restricted to bipartite graphs.

The hardness result shown in this paper extends also beyond constant $k$. We
show that the basic $k$-spanner problem with stretch requirement $k = O(\log^{\mu} n)$,
$0 < \mu < 1$, cannot be approximated with a ratio of $O(2^{\log^{\varepsilon} n})$ for any $0 < \varepsilon <
1 - \mu$. Given that the $k$-spanner problem with $k = O(\log^{\mu} n)$, $0 < \mu < 1$, admits
an $O(2^{\log^{1-\mu} n})$ approximation algorithm [?], our hardness result is tight for stretch
requirements $k = O(\log^{\mu} n)$, $0 < \mu < 1$.

Let us remark that this result cannot be pushed much further, since as
mentioned earlier, at the end of the spectrum the problem admits an
$O(1)$-approximation ratio once $k = O(\log n)$ [14], hence our result cannot be
extended to $k = \log n$. It also seems very difficult (if not impossible) to strengthen
our result in the sense of providing an $\Omega(n^{c})$-inapproximability result for the
basic spanner problem (or even for the unit-weight spanner problem), since
in [?] we presented a reduction from the unit-weight spanner problem to the
MMSA$_3$ problem (also known as the Red-Blue problem), and in turn, there is
also a reduction from MMSA$_3$ to the Label-Cover$_{MIN}$ problem [?]. Hence
an $\Omega(n^{c})$-inapproximability result for some $c > 0$ for one of these problems
would imply that a polynomial-time automatic prover does not exist, since its
existence implies that there is a polynomial approximation within any polynomial
factor for propositional proof length. The existence of such a polynomial
time automatic prover is a long-standing open problem in proof theory (cf. [?]).
Moreover, such an $\Omega(n^{c})$-inapproximability result would imply a similar result
for a long list of Class III problems, including MMSA, LCMIN,
Min-Length-Frege-Proof, Min-Length-Resolution-Refutation, AND/OR Scheduling,
Nearest-Lattice-Vector, Nearest-Codeword, Learning-Halfspaces, Quadratic-Programming,
Max-$\pi$-Subgraph, Longest-Path, Diameter-Subgraph and many others (see
[?]). In other words, it would cause classes III and IV of [?] to collapse into
a single problem class.

To the best of our knowledge, the above family of $\{k\text{-spanner}\}_{k=\log^{\mu} n}$ problems,
$0 < \mu < 1$, forms the first example for a Class III problem for which
the upper and lower bounds converge to the same function when the parameter
grows. Another family of problems, defined later on, which is also proved to
enjoy a similar property, is the family of $\{$MINREP$_t\}_{t=\log^{\mu} n}$ problems.

As a direct consequence, we also extend the results of [?] on the strong
inapproximability of a number of generalizations of the $k$-spanner problem beyond
what was known thus far. In particular, we establish strong inapproximability
for the following problems, defined below. First, we extend the result of [?]
for the uniform $k$-spanner problem with stretch requirements in the range of
$1 < k < 3$, to any constant $k > 1$. Similarly, the strong inapproximability result
for the unit-weight $k$-spanner problem with $k$ in the range $1 < k < 3$ is extended
to any constant $k > 1$. The strong inapproximability result for the $k$-spanner
augmentation problem with $k$ in the range $4 \leq k = O(n^{\delta})$ is now extended to
$3 \leq k = O(n^{\delta})$. Moreover, in addition to the strong inapproximability of the
disjoint (DJ) and all client (AC) (and thus client-server (C-S)) $k$-spanner problems
for stretch requirements $3 \leq k = O(n^{\delta})$, $0 < \delta < 1$, established in [?], we now
conclude the strong inapproximability result for the all server (AS) $k$-spanner
problem for any constant $k$.

The structure of the paper is as follows. Our main result is established by
a sequence of reductions. We start from the MAX3SAT problem, and use it
to show the inapproximability of the MAX3SAT$_t$ problem presented in
Section 2. Using this problem we establish the strong inapproximability of the
MAXREP$_t$ and the MINREP$_t$ problems presented in Sections 3 and 5,
respectively. The last reduction transforms instances of the MINREP$_t$ problem to
the $(t-1)$-spanner problem. The MINREP$_t$ problem is a restricted version
of the MINREP problem, shown to be strongly inapproximable in [?]. The
restriction is geared at ensuring that the graphs underlying the given instances
have girth greater than $t$, which is an essential component for facilitating the
final reduction from MINREP to the $(t-1)$-spanner problem.

We believe the MINREP$_t$ problem, whose strong inapproximability is
established here for every constant $t$, may be found useful in the future for proving
hardness results on other problems that are not as hard as the (general)
MINREP problem.

2 The MAX3SAT$_t$ Problem

Definition 1. For any maximization problem $\Pi$ that attains values between 0 and
1, we say that the problem is $(1-\delta)$-distinguishable if there is a polynomial time
algorithm that given an input $I$ is able to distinguish between the case $\Pi(I) = 1$
and the case $\Pi(I) < 1 - \delta$ (i.e., the algorithm returns 1 in the former case and
0 in the latter; there are no guarantees on the behavior of the algorithm for an
intermediate input $I$, such that $1 - \delta \leq \Pi(I) < 1$).



Definition 2. For problems $\Pi$, $\Pi'$ and real $\gamma > 0$, we say that $\Pi \propto_{\gamma} \Pi'$ if there
exists a polynomial time reduction $\varphi$ from inputs of $\Pi$ to inputs of $\Pi'$, for which
the following two properties hold for any instance $I$ of $\Pi$.

(P1) If $\Pi(I) = 1$ then $\Pi'(\varphi(I)) = 1$.

(P2) If $\Pi(I) < 1 - \delta$ then $\Pi'(\varphi(I)) < 1 - \delta/\gamma$.

Lemma 1. If a problem $\Pi$ is $(1-\delta)$-indistinguishable unless $NP = P$ and
$\Pi \propto_{\gamma} \Pi'$ then the problem $\Pi'$ is $(1 - \delta/\gamma)$-indistinguishable under the same
assumption.

(Proofs are omitted from this extended abstract.) 

Recall that MAX3SAT is the problem of finding a truth assignment for a 
given boolean formula, that satisfies the maximal number of its clauses. 

Lemma 2. [?] MAX3SAT is $(1 - \delta_0)$-indistinguishable for
some $0 < \delta_0 < 1$, unless $NP = P$.

For every instance $I$ of the MAX3SAT problem, construct the bipartite graph
$G_I = (L, R, E)$, where the set $L$ contains a node $v(c_j)$ for every clause $c_j$ and
the set $R$ contains a node $v(x_i)$ for every variable $x_i$ occurring (positively or
negatively) in $I$. An edge connects $v(c_j)$ and $v(x_i)$ if the variable $x_i$ occurs in
the clause $c_j$.

Define the MAX3SAT$_t$ problem to be the MAX3SAT problem with an
additional restriction that the girth of the graph $G_I$ corresponding to the instance
$I$ satisfies $girth(G_I) > t$.

We start by presenting a reduction $\varphi$ establishing that
MAX3SAT$_t$ $\propto_3$ MAX3SAT$_{2t}$. Given an instance $I$ of the MAX3SAT$_t$
problem, define the instance $\varphi(I)$ as follows. For every clause $c = (x_1, x_2, x_3)$, define
two auxiliary variables $y_1, y_2$ that will be used for this clause only, and replace
the clause by a set of three new clauses $s(c) = \{(x_1, y_1), (\bar{y}_1, x_2, y_2), (\bar{y}_2, x_3)\}$.

Similarly, for every clause $c = (x_1, x_2)$ we introduce only one auxiliary variable
$y_1$ and replace $c$ by a set of two new clauses $s(c) = \{(x_1, y_1), (\bar{y}_1, x_2)\}$. For a
singleton clause $c$, set $s(c) = \{c\}$.
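
The clause-splitting step can be sketched and checked by brute force. This is an illustration under assumptions: literals are encoded as nonzero integers in the DIMACS style (`-v` is the negation of variable `v`), the helper names are hypothetical, and the placement of the negated auxiliary literals is one standard choice that makes the completion property hold; the source may use a different but equivalent convention.

```python
from itertools import product


def split_clause(clause, next_var):
    """Split a clause with 1-3 literals; return (new_clauses, next unused variable)."""
    if len(clause) == 1:
        return [tuple(clause)], next_var
    if len(clause) == 2:
        x1, x2 = clause
        y1 = next_var
        return [(x1, y1), (-y1, x2)], next_var + 1
    x1, x2, x3 = clause
    y1, y2 = next_var, next_var + 1
    return [(x1, y1), (-y1, x2, y2), (-y2, x3)], next_var + 2


def satisfied(clause, assignment):
    """A clause holds if some literal agrees with the assignment (dict var -> bool)."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def completion_exists(new_clauses, assignment, aux_vars):
    """Is there a setting of the auxiliary variables satisfying all new clauses?"""
    for values in product([False, True], repeat=len(aux_vars)):
        full = dict(assignment)
        full.update(zip(aux_vars, values))
        if all(satisfied(c, full) for c in new_clauses):
            return True
    return False


clause = (1, -2, 3)                      # x1 or (not x2) or x3
new_clauses, next_free = split_clause(clause, next_var=4)
print(new_clauses, next_free)
```

For every assignment of the original variables, the original clause is satisfied exactly when some setting of the auxiliary variables satisfies all clauses of the split, which is the property the reduction relies on.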

We later make use of the following two easily verified observations. 

Observation 1 (1) For every truth assignment $\tau$ for $I$ there exists a completion
$\tau'$ for $\varphi(I)$ such that for every clause $c$ of $I$, $c$ is satisfied by $\tau$ iff all the clauses
in $s(c)$ are satisfied by $\tau'$.

(2) If the original formula $I$ satisfies $girth(G_I) = p$ then the resulting $\varphi(I)$
satisfies $girth(G_{\varphi(I)}) \geq 2p$.

For an input $I$ and a truth assignment $\tau$, we denote the number of clauses
in $I$ by $m_I$, the number of clauses that $\tau$ satisfies in $I$ by $m_{I,\tau}$, and the ratio
between them by $\beta_{I,\tau} = m_{I,\tau}/m_I$. Let $\beta^*_I$ be the ratio $\beta_{I,\tau}$ obtained under a
best truth assignment $\tau$, i.e., $\beta^*_I = \max_{\tau}\{\beta_{I,\tau}\}$.

We now prove that the reduction $\varphi$ satisfies the properties (P1) and (P2) of
Definition 2. The former follows immediately from Observation 1, implying

Lemma 3. If $\beta^*_I = 1$ then $\beta^*_{\varphi(I)} = 1$.

Lemma 4. If $\beta^*_I < 1 - \delta$ then $\beta^*_{\varphi(I)} < 1 - \delta/3$.

Lemma 5. MAX3SAT$_t$ $\propto_3$ MAX3SAT$_{2t}$.

As a result of this, and since MAX3SAT is equivalent to MAX3SAT$_3$
(because the girth of any bipartite graph is greater than 3), we conclude, by a
constant number $p = \lceil \log_2 (t/3) \rceil$ of applications of the above reduction,

Lemma 6. For any constant integer $t > 0$, MAX3SAT $\propto_{3^p}$ MAX3SAT$_t$.

We observe, however, that the lemma holds even if p is not a constant, but 
grows with n. This can be easily seen by the same proof argument. 

Lemma 7. For any constant integer $t > 2$, there exists a constant $0 < \delta < 1$
such that the MAX3SAT$_t$ problem is $(1-\delta)$-indistinguishable, unless $NP = P$.

The above lemma can be extended to hold for non-constant values of t as 
well. In particular, we show the following. 

Lemma 8. There exist constants 0 < Ci < 1, C 2 > 1 such that for any constant 
0 < fj, < 1 the M AXSSATt problem with t = log°^^ n is {1 — 6 ) -indistinguishable, 
unless NP = P, where 5 = and Jq is the constant from Lemma 

3 The MAXREP$_t$ Problem

Extending the definition given in [?], the MAXREP problem is defined as
follows. An instance of the problem consists of the pair $\mathcal{M} = (G, \tilde{G})$, where
$G = (L, R, E)$ and $\tilde{G} = (\tilde{L}, \tilde{R}, \tilde{E})$ are bipartite graphs. $L$ and $R$ are each split into a
disjoint union of $h_l$ and $h_r$ sets respectively, $L = \bigcup_{i=1}^{h_l} U_i$, $R = \bigcup_{j=1}^{h_r} W_j$.
The numbers $h_l$ and $h_r$ satisfy $h_l \leq h_r \leq h_l \cdot 2^{\log^{\delta} n}$, for some $0 < \delta < 1$, where
$n = |L| + |R|$. Let $N = \max\{|U_i|, |W_j| : 1 \leq i \leq h_l, 1 \leq j \leq h_r\}$.

The second component in the instance $\mathcal{M}$ is a "supergraph" $\tilde{G} = (\tilde{L}, \tilde{R}, \tilde{E})$
induced by $G$ and the partitions of $L$ and $R$, namely, with "supernodes" $\tilde{L} =
\{U_1, \ldots, U_{h_l}\}$ and $\tilde{R} = \{W_1, \ldots, W_{h_r}\}$, where two supernodes $U_i$ and $W_j$
are adjacent in $\tilde{G}$ iff there exist some nodes $u_i \in U_i$ and $w_j \in W_j$ which are
adjacent in $G$, $\tilde{E} = \{(U_i, W_j) \mid \exists u_i \in U_i, w_j \in W_j$ s.t. $(u_i, w_j) \in E\}$.

A set of vertices $C$ is said to REP-cover the superedge $(U_i, W_j)$ if there exist
nodes $u_i \in C \cap U_i$ and $w_j \in C \cap W_j$ which are adjacent in $G$. The set $C$ is a
MAXREP-cover for $\mathcal{M}$ if $|C \cap U_i| = 1$ for every $1 \leq i \leq h_l$ and $|C \cap W_j| = 1$
for every $1 \leq j \leq h_r$. It is required to select a MAXREP-cover $C$ for $\mathcal{M}$ that
REP-covers a maximal number of superedges of $\tilde{G}$.
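
For very small instances the definition can be evaluated exhaustively. The sketch below is illustrative only: the representation (supernodes as lists of node names, edges as left-right pairs), the function names, and the normalisation of the value as the fraction of covered superedges (chosen so that values lie between 0 and 1) are assumptions of this example, and the search is exponential in the number of supernodes.

```python
from itertools import product


def superedges(left_parts, right_parts, edges):
    """Pairs (i, j) such that some node of U_i is adjacent to some node of W_j."""
    where_left = {u: i for i, part in enumerate(left_parts) for u in part}
    where_right = {w: j for j, part in enumerate(right_parts) for w in part}
    return {(where_left[u], where_right[w]) for u, w in edges}


def covered_fraction(left_choice, right_choice, edges, supers):
    """Fraction of superedges REP-covered by one representative per supernode."""
    edge_set = set(edges)
    hit = sum((left_choice[i], right_choice[j]) in edge_set for i, j in supers)
    return hit / len(supers)


def maxrep_value(left_parts, right_parts, edges):
    """Best covered fraction over all choices of one node per supernode."""
    supers = superedges(left_parts, right_parts, edges)
    best = 0.0
    for left_choice in product(*left_parts):
        for right_choice in product(*right_parts):
            best = max(best, covered_fraction(left_choice, right_choice, edges, supers))
    return best


left_parts = [["a1", "a2"]]              # one left supernode U_1
right_parts = [["b1"], ["b2"]]           # two right supernodes W_1, W_2
edges = [("a1", "b1"), ("a2", "b2")]
print(maxrep_value(left_parts, right_parts, edges))
```

In this toy instance either representative of the single left supernode covers only one of the two superedges, so the value is 1/2; adding the edge `("a1", "b2")` raises it to 1.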

We remark that the problem is defined here in a slightly more general form
than in [?]. In particular, here we do not require graph-regularity on either
bipartition. Also we no longer require the same number of supernodes at the left
and at the right, but only that both numbers are "close." We also do not require
that the same number of nodes in a supernode are used at the left and at the
right.

We also define the MAXREP$_t$ problem which is the same as MAXREP,
with the additional restriction that the instances have girth{G) > t. It is shown 
in that the problem admits an 0(n'f+T )-approximation algorithm, and, in 
particular, that MAXREP admits a -y/n-approximation algorithm. In the next 
two sections we show that the MAXREPt problem (and in fact even the more 
restricted MAXREP^ problem, defined later) is 2*°s "-indistinguishable unless 
NP g DTIME{rE°^y^°a »). 

Lemma 9. MAX3SAT$_t$ $\propto$ MAXREP$_t$.

Observe that the reduction always creates MAXREPt instances in which 
the supernode size is N = 0(1). Combining Lemmas 0 0and0we get 

Lemma 10. For any constant integer $t > 2$, there exists a constant $0 < \delta < 1$
s.t. the MAXREP$_t$ problem is $(1-\delta)$-indistinguishable, unless $NP = P$.

Combining Lemma 0 with Lemma 0 and Lemma 0 we get 

Lemma 11 . There exist constants 0 < Ci < 1, C 2 > 1 such that for any constant 
0 < fx < 1 the MAXREPt problem with t = log^^^ n is {1 — 6 ) -indistinguishable, 
unless NP = P, where 6 = and Jg the constant from Lemma 0. 

4 The Boosting Reduction 

Definition 3. For an instance $\mathcal{M} = (G, \tilde{G})$, $G = (L, R, E)$, $\tilde{G} = (\tilde{L}, \tilde{R}, \tilde{E})$ of
MAXREP and an integer $r \geq 1$, define the $r$-power of $\mathcal{M}$ as an instance $\mathcal{M}^r$
constructed as follows.

1. The set of left vertices $L^r$ is the set of $r$-sequences (with repetitions) of the
left vertices of $G$, i.e., $L^r = L \times \ldots \times L$ ($r$ times).

2. Similarly, $R^r = R \times \ldots \times R$ ($r$ times).

3. There is an edge between $u = (u_1, u_2, \ldots, u_r) \in L^r$ and $w = (w_1, w_2, \ldots, w_r)
\in R^r$ if for every $i = 1, 2, \ldots, r$ there is an edge $(u_i, w_i)$ in $G$.

4. For every $r$-tuple of left supernodes (with repetitions) $(U_1, U_2, \ldots, U_r)$,
define a supernode $(U_1, U_2, \ldots, U_r)$ of $\tilde{G}^r$ that contains all the nodes $u =
(u_1, u_2, \ldots, u_r)$ such that $u_i \in U_i$ for every $i = 1, 2, \ldots, r$, and analogously
for the right-hand side.
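
The four steps of the definition translate almost literally into code. The following sketch uses the same illustrative representation as before (supernodes as lists of node names, edges as left-right pairs); the function name `power_instance` is hypothetical, and the construction is exponential in `r`, so it is meant for toy instances only.

```python
from itertools import product


def power_instance(left_parts, right_parts, edges, r):
    """Return (left supernodes, right supernodes, edges) of the r-power instance."""
    edge_set = set(edges)
    left = [u for part in left_parts for u in part]
    right = [w for part in right_parts for w in part]

    # Steps 1-3: vertices are r-tuples; two tuples are adjacent iff they are
    # adjacent coordinate by coordinate in the original graph.
    new_edges = [
        (u, w)
        for u in product(left, repeat=r)
        for w in product(right, repeat=r)
        if all((ui, wi) in edge_set for ui, wi in zip(u, w))
    ]

    # Step 4: one supernode per r-tuple of original supernodes, containing
    # every tuple whose i-th coordinate lies in the i-th chosen supernode.
    new_left_parts = [list(product(*combo)) for combo in product(left_parts, repeat=r)]
    new_right_parts = [list(product(*combo)) for combo in product(right_parts, repeat=r)]
    return new_left_parts, new_right_parts, new_edges


left_parts = [["a1", "a2"]]
right_parts = [["b1"], ["b2"]]
edges = [("a1", "b1"), ("a2", "b2")]
lp2, rp2, e2 = power_instance(left_parts, right_parts, edges, r=2)
print(len(lp2), len(rp2), len(e2))
```

An edge of the power instance corresponds to an $r$-tuple of original edges, so the toy instance with 2 edges has $2^2 = 4$ edges in its 2-power, grouped into $1$ left and $4$ right supernodes.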



Lemma 12. For any instance $\mathcal{M}$ of the MAXREP$_t$ problem, if MAXREP$_t(\mathcal{M})
= 1$ then MAXREP$_t(\mathcal{M}^r) = 1$.



Lemma 13. For every graph $G$ and integer $r \geq 1$, $girth(G^r) \geq girth(G)$.

The following lemma is stated implicitly in [?] for the Label-Cover$_{MAX}$
problem. (It is also stated explicitly in [?], p. 419, but only for regular
supergraphs.)

Lemma 14. [16] Let $\mathcal{M}$ be a MAXREP-instance such that MAXREP$(\mathcal{M}) \leq
1 - \delta$ with the size of the supernodes bounded by a constant $N$. Then there exists
a constant $c = c(N, \delta) > 0$ such that MAXREP$(\mathcal{M}^r) \leq ($MAXREP$(\mathcal{M}))^{cr}$.

It is shown in [?] that the Label-Cover$_{MAX}$ problem and the MAXREP
problem are equivalent, hence we conclude

Corollary 1. Let $\mathcal{M}$ be a MAXREP$_t$-instance such that MAXREP$_t(\mathcal{M}) \leq
1 - \delta$ with the size of the supernodes bounded by a constant $N$. Then there exists a
constant $c = c(N, \delta) > 0$ such that MAXREP$_t(\mathcal{M}^r) \leq ($MAXREP$_t(\mathcal{M}))^{cr}$.



Lemma 15. For any instance $\mathcal{M}$ of the MAXREP$_t$ problem, if MAXREP$_t(\mathcal{M})
\leq 1 - \delta$ then MAXREP$_t(\mathcal{M}^r) \leq (1 - \delta)^{cr}$.



Lemma 16. For any $0 < \varepsilon < 1$, the problem MAXREP$_t$ is $2^{-\log^{\varepsilon} n}$-indistinguishable
(or $(1-\delta)$-indistinguishable with $\delta = 1 - 2^{-\log^{\varepsilon} n}$), unless
$NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$.
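
Proofs are omitted from this extended abstract. A standard parameter calculation for this kind of boosting step, given here only as a hedged sketch (the exponent $a$, the constant $c'$ and the size bound $n_r \leq n^r$ are assumptions of the sketch and may differ from the authors' argument), runs as follows.

```latex
% Sketch: the r-power instance has at most n^r nodes; take r = log^a n
% for a constant a > 0 and write n_r for the size of the new instance.
\begin{align*}
\log n_r &\le r \log n = \log^{a+1} n, \\
\mathrm{MAXREP}_t(\mathcal{M}^r) &\le (1-\delta)^{cr} = 2^{-c' \log^{a} n},
  \qquad c' = c \log_2 \tfrac{1}{1-\delta} > 0, \\
\log^{a} n &= \bigl(\log^{a+1} n\bigr)^{a/(a+1)} \ge (\log n_r)^{a/(a+1)} .
\end{align*}
% Hence the value is at most 2^{-c' (log n_r)^{a/(a+1)}}.  Choosing a with
% a/(a+1) > eps gives at most 2^{-log^eps n_r} for large n, and building
% the power instance takes time n^{polylog n}.
```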



Lemma 17. There exists a constant $0 < c_1 < 1$ such that for any constant
$0 < \mu < 1$ and for any constant $0 < \varepsilon < 1 - \mu$ the MAXREP$_t$ problem with
$t = \log^{c_1 \mu} n$ is $2^{-\log^{\varepsilon} n}$-indistinguishable (or $(1-\delta)$-indistinguishable, where $\delta =
1 - 2^{-\log^{\varepsilon} n}$), unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$.

5 The MINREP$_t$ Problem

The MINREP problem is the minimization version of the MAXREP problem defined
in Section 3. Both problems were introduced in [?], and the MINREP problem
was studied in [?]. In the MINREP problem, any number of nodes can be taken
into the REP-cover from one supernode (in contrast to the MAXREP problem,
where we insist on taking at most one representative from each supernode),
but it is required that the REP-cover should be proper, i.e., it should cover
all the superedges. The problem is to find a minimal size proper REP-cover.
We consider also the MINREP$_t$ problem, which is the MINREP problem
restricted to instances whose supergraph $\tilde{G}$ has $girth(\tilde{G}) > t$.

The MIN REP problem is proved to be strongly inapproximable in m In 
0 we have shown that this problem admits a -yAr-approximation algorithm, and 
more generally, that the MINREPt problem admits an n^/*^*“*'^^-approximation 
ratio. 

In this section we show that the MINREP_t problem is strongly inapproximable,
unless NP ⊆ DTIME(n^{polylog n}). This is done by presenting a reduction
from the MAXREP_t problem, discussed in the previous sections, to the
MINREP_t problem, and then using Lemma [?].
We make use of the following technical lemma.
Lemma 18. Let a_1, ..., a_n, b_1, b_2, ... > 0 such that Σ_i a_i + Σ_{j≥1} b_j = z.



Lemma 19. Given an approximation algorithm A for the MINREP_t problem
with approximation ratio ρ, for any 0 < ε < 1, the following properties hold for
any instance M of the MAXREP_t problem.

(Q1) If MAXREP_t(M) = 1 then A(M) < ρ(h_l + h_r).

(Q2) If MAXREPt(M) < then A{M) > 2 • 23>°s' " • hlhr. 

We observe that the randomized argument used in the proof can be deran- 
domized using the method of conditional probabilities. 

Theorem 2. For any 0 < ε < 1 there is no approximation algorithm for
the MINREP_t problem with approximation ratio ρ < 2^{log^ε n}, unless
NP ⊆ DTIME(n^{polylog n}).



Lemma 20. There exists a constant 0 < c_1 < 1 such that for any constant
0 < μ < 1 and for every constant 0 < ε < 1 − μ, there is no algorithm for the
MINREP_t problem with t = log^{c_1 μ} n with an approximation ratio ρ < 2^{log^ε n},
unless NP ⊆ DTIME(n^{polylog n}).

The proof is the same as for Theorem 2 except that at the end we use Lemma 17.

6 The k-Spanner Problem

The hardness of the basic k-spanner problem for k ≥ 3 is proven by an extension
of the reduction from the MINREP problem to the DJ C-S k-spanner problem
from [?]. Let M = (G, 𝒢) be a MINREP_{k+1} instance, where G = (L, R, E),
𝒢 = (ℒ, ℛ, ℰ), ℒ is a collection of disjoint subsets of L and ℛ is a collection
of disjoint subsets of R. It can be checked in polynomial time whether there
exists a REP-cover C for M, by checking whether L ∪ R REP-covers all the
superedges. If L ∪ R does not REP-cover all the superedges, the instance has
no MINREP-cover. Thus, without loss of generality, we assume that L ∪ R does
REP-cover all the superedges and, in particular, there exists a REP-cover for M.
We set h = hi + hr and ki = ^^^d x = n^/h, and build the 

graph Ḡ as follows (see Fig. 1). For any integer z > 0 we denote [z] = {1, ..., z}.
Define new vertex sets

S = {s^p_{i,i'} | i ∈ [h_l], i' ∈ [k_l], p ∈ [x]} and T = {t^p_{j,j'} | j ∈ [h_r], j' ∈ [k_r], p ∈ [x]},

and set V̄ = L ∪ R ∪ S ∪ T and Ē = E ∪ E_SU ∪ E_TW ∪ E_Q ∪ E_M, where

E_M = {(s^p_{i,i'}, s^p_{i,i'+1}) | p ∈ [x], i ∈ [h_l], i' ∈ [k_l − 1]}
      ∪ {(t^p_{j,j'}, t^p_{j,j'+1}) | p ∈ [x], j' ∈ [k_r − 1], j ∈ [h_r]},

E_SU = {(s^p_{i,1}, u_i) | u_i ∈ U_i, i ∈ [h_l], p ∈ [x]},

E_TW = {(t^p_{j,1}, w_j) | w_j ∈ W_j, j ∈ [h_r], p ∈ [x]},

E_Q = ⋃_{p=1}^{x} E_Q^p, where

Fig. 1. The graph Ḡ.

Denote also E^^ = | {Ui, Wj) £ G,1 < p < x}. Observe that for 

k = 3, k_l = k_r = 0 and thus E_M = ∅.

Note also that each set of edges E_Q^p is an isomorphic copy of the supergraph
𝒢. The graph is built in such a way that all the edges except those of E_Q can be
easily spanned, and the proof is based on establishing a connection between the
size of the REP-cover and the number of edges required for spanning the edges of E_Q.
It is easy to see that for even values of k all the cycles in Ḡ are of even length.
Thus in this case Ḡ is a bipartite graph. This enables us to show our hardness
result for the basic k-spanner problem, even when restricted to bipartite graphs.
The following two observations are immediate.

Observation 3. (1) No path k-spanning an E_Q edge can pass through some
other E_Q edge and an E edge.

(2) No path using only E_Q edges can k-span another E_Q edge.

The first observation holds because otherwise the length of the path would
be longer than k. The second follows because girth(𝒢) > k + 1.

The next lemma claims, intuitively, that the copies E_Q^p cannot help each
other in spanning their edges.

Lemma 21. No path P such that P ∩ E_Q^p ≠ ∅ can be used to k-span an edge
from E_Q^q, for p ≠ q.

Corollary 2. No path P such that P ∩ E_Q^p ≠ ∅ can k-span any E_Q^p edge e such
that e ∉ P.

Lemma 22. Let P be a simple path such that P ∩ E_Q ≠ ∅. Let the edge e satisfy
e ∈ E_Q \ P. Then P does not k-span the edge e.

It follows that there are only two ways to k-span the E_Q edges in Ḡ, namely,
either to self-span them, or to span them by a direct path via the original edge set
E, namely, a path that leads from one endpoint of the edge through the layers of S
to a node u_i, crosses an edge (u_i, w_j) of E, and continues from w_j through the
layers of T to the other endpoint.

Note that the length of such a path is exactly k_l + k_r + 3 = k. Intuitively, edges
in E are very cheap, and the really crucial edges are the stars from the u_i's and
w_j's to the first layer of S and T. Thus, we would really want to minimize the
number of such stars, which would correspond to the needed number of nodes
taken in the REP-cover.

A k-spanner H for the graph Ḡ is called a proper spanner if it does not use
any edge of E_Q, i.e., H ∩ E_Q = ∅.
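The notions of k-spanning and of a proper spanner can be checked mechanically on small graphs. The sketch below assumes the standard definition that a subgraph H k-spans an edge (u, v) when H contains a path of length at most k between u and v; the function names and the 4-cycle example are illustrative and do not come from the paper.

```python
from collections import deque

def bfs_dist(adj, src, limit):
    """Distances from src in the graph `adj`, explored up to depth `limit`."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if dist[v] == limit:
            continue  # do not expand beyond the allowed stretch
        for u in adj.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def is_k_spanner(graph_edges, spanner_edges, k):
    """True if every edge of the graph is k-spanned by the spanner edges."""
    adj = {}
    for u, v in spanner_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return all(v in bfs_dist(adj, u, k) for u, v in graph_edges)

def is_proper(spanner_edges, forbidden_edges):
    """True if the spanner uses none of the forbidden edges (the role of E_Q)."""
    used = {frozenset(e) for e in spanner_edges}
    return not (used & {frozenset(e) for e in forbidden_edges})

# Toy example: a 4-cycle spanned by a path of three of its edges.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
path = [(1, 2), (2, 3), (3, 4)]
```

Here the omitted edge (4, 1) is spanned only by the path of length 3, so the path is a 3-spanner of the cycle but not a 2-spanner.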

Corollary 3. A proper k-spanner H k-spans all the edges of E_Q by direct paths.

A main observation made next is that forbidding the use of E_Q edges does
not degrade the spanner quality by much.

Lemma 23. Any k-spanner H for Ḡ can be converted in polynomial time to a
proper k-spanner H' of size |H'| < 6|H|.

Lemma 24. Given a k-spanner H for Ḡ there is a polynomial time constructible
REP-cover G for M = (G, G) of size |G| < ^ . 

For the opposite direction we prove the following lemma. 

Lemma 25. Given a REP-cover C for the MINREP instance M = (G, 𝒢),
there is a poly-time constructible k-spanner H of Ḡ of size |H| < (k − 1)x|C|.

Now we are ready to prove our main result.

Theorem 4. For every constant k > 2, the basic k-spanner problem is strongly
inapproximable, even when restricted to bipartite graphs, unless
NP ⊆ DTIME(n^{polylog n}).
Again, we also generalize our result to the k-spanner problem with a stretch
requirement k = log^{c_1 μ} n.

Theorem 5. There exists a constant 0 < c_1 < 1 such that for any constant
0 < μ < 1 and for every constant 0 < ε < 1 − μ, the basic (log^{c_1 μ} n)-spanner
problem is 2^{log^ε n}-inapproximable, unless NP ⊆ DTIME(n^{polylog n}).

The proof of the theorem is analogous to the proof of Theorem 4 with only slight
changes in the analysis. It uses Lemma 20 instead of Theorem 2 at the end.

Improving the exponent 

We note that the lower bound of Theorem 5 is tight in the sense that the
approximation algorithm of [?] supplies an approximation ratio of n^{O(1/k)} =
2^{O(log^{1−μ} n)} for the k-spanner problem with k = log^μ n. Furthermore, it cannot
be extended to k = log n, because the problem becomes O(1)-approximable for
this stretch requirement [?]. However, the lower bound as is applies only to the
k-spanner problem with the stretch requirement log^{μ'} n with 0 < μ' < c_1, and in
the above analysis c_1 = log_3 2. Intuitively, this value of c_1 follows from the fact
that the reduction in Section 2 “loses” a factor of 3 on each iteration.

Let us point out, however, that we are able to show that after sufficiently
many iterations the reduction loses only a factor of 2 + η on each iteration, where
η > 0 is arbitrarily small. This observation can be used in order to extend the
result to any constant 0 < c_1 < 1. The exact analysis will appear in the full
paper.
Abstract. We show that a set of uniformly width-bounded infinite
series-parallel pomsets is ω-series-rational iff it is axiomatizable in monadic
second order logic iff it is ω-recognizable. This extends recent work
by Lodaya and Weil on sets of finite series-parallel pomsets in two aspects:
It relates their notion of series-rationality to logical concepts, and
it generalizes the equivalence of recognizability and series-rationality to
infinite series-parallel pomsets.



1 Introduction 

In theoretical computer science, finite words are a classical concept that is used
to model the behavior of a sequential system. In this setting, the atomic actions
of the system are considered as letters of an alphabet Γ. A natural operation
on such sequential behaviors is concatenation; it models that, after finishing
one task, a system can start another one. Therefore, the natural mathematical
model is that of a (free) monoid Γ*. To model not only the behavior of a
sequential system, but also allow parallelism, labeled partially ordered sets or
pomsets were suggested [?]. In this setting, there is not only one, but
there are (at least) two natural operations: A parallel system can start a new
job after finishing the first one, or it can perform two jobs in parallel. These
two operations are mathematically modeled by the sequential and the parallel
product on pomsets: In the sequential product, the second pomset is set on top
of the first. Complementarily, in the parallel product, the two pomsets are put
side by side. Thus, in the sequential product all events of the first factor are
related to all events of the second, while in the parallel product no additional
relations are inserted. Another approach is that of Mazurkiewicz traces. Here,
the sequentiality/parallelism is dictated by a fixed dependence relation on the
set of actions. Therefore, the trace product (w.r.t. a given dependence relation)
of two pomsets relates only dependent events of the two factors.

Pomsets that one obtains by the sequential and the parallel product from the
singletons are known as series-parallel pomsets. It was shown that finite
series-parallel pomsets are precisely those pomsets that do not contain a subposet of the
form N (hence their alternative name “N-free posets”) [?]. Together with the
sequential and the parallel product, the finite N-free pomsets form an algebra,
called sp-algebra, that generalizes the free monoid Γ*. The equational theory of
this algebra was considered in [?]. Pomsets constructed from the singletons by
the trace product are called traces. Together with the trace product, they form
a monoid, called trace monoid. See [?] for a recent survey on the many results
known on traces.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 648–662, 2000.

Several models of computational devices are known in theoretical computer
science. Probably the simplest one is that of a finite automaton, i.e. a finite state
device capable of accepting or rejecting words. Several characterizations of the
accepting power of finite automata are known: A set of words L can be accepted
by a finite automaton iff it is rational (Kleene), axiomatizable in monadic second
order logic (Büchi) or recognizable by a homomorphism into a finite monoid
(Myhill-Nerode). Several attempts have been made to carry the success story
of finite automata over to pomsets, i.e. to transfer the nice results from the setting
of a sequential machine to concurrent systems. For traces, this was achieved to
a large extent by asynchronous (cellular) automata [22] (see [?] for an extension
to pomsets without autoconcurrency). For N-free pomsets, Lodaya and Weil
introduced branching automata. In [?] they were able to show that a set
of finite width-bounded N-free pomsets is rational iff series-rational (i.e. can be
constructed from the singletons by union, sequential and parallel product and
by the sequential iteration) iff recognizable (i.e. saturated by a homomorphism
into a finite sp-algebra). This was further extended in [?] by the consideration
of sets that are not uniformly width-bounded.

While finite words are useful to deal with the behavior of terminating systems,
ω-words serve as a model for the behavior of nonterminating systems. Most of
the results on recognizable languages of finite words were extended to ω-words
(see [?] for an overview). For traces, this generalization was fruitful, too [?].
Bloom and Ésik [?] considered the set of pomsets obtained from the singletons
by the sequential and the parallel product and by the sequential ω-power. In
addition, Ésik and Okawa [?] allowed the parallel ω-power. They obtained inner
characterizations of the pomsets obtained this way and considered the equational
theory of the corresponding algebras.

This paper deals with the set of pomsets that can be obtained by the
sequential and the parallel product as well as by the infinite sequential product
of pomsets. First, we show a simple characterization of these pomsets (Lemma
1). The main part of the paper is devoted to the question whether Büchi’s
correspondence between monadic second order logic on ω-words and recognizable
sets can be transferred to the setting of (infinite) N-free pomsets. Our main
result, Theorem [?], states that this is indeed possible. More precisely, we
consider ω-series-rational sets, i.e. sets that can be constructed from finite sets of
finite N-free pomsets by the operations of sequential and parallel concatenation,
sequential iteration, sequential ω-iteration and union (without the ω-iteration,
this class was considered in [?]). We can show that a set of infinite N-free
pomsets is ω-series-rational if and only if it can be axiomatized in monadic
second order logic and is width-bounded. Our proof relies on a suitable (algebraic)
definition of recognizable sets of infinite N-free pomsets and on a deep result
from the theory of infinite traces [?].

Recall that Courcelle [2] considered the counting monadic second order logic
on graphs of finite tree width. In this setting, a set of finite graphs is
axiomatizable in Courcelle’s logic if and only if it is “recognizable” [?]. It is not difficult
to show that any ω-series-rational set of N-free pomsets is axiomatizable in this
logic. If one tried to prove the inverse implication, i.e. started from an
axiomatizable set of N-free pomsets, one would obtain a rational set of terms over the
parallel and the sequential product. But, as usual in term languages, this set
makes use of an extended alphabet. Therefore, it is not clear how to construct
a series-rational expression without additional variables from this rational term
language. Because of this difficulty, we chose to prove our main result using
traces and not Courcelle’s approach.

Let us finish this introduction with some open problems that call for an
investigation: First, we obtained only a relation between algebraically recognizable,
monadically axiomatizable, and ω-series-rational sets. It would be interesting to
have a characterization in terms of branching automata, too. For this purpose,
one first has to extend them in such a way that branching automata can run on
infinite N-free pomsets. Second, we would have liked to incorporate the parallel
iteration or even the parallel ω-power in the construction of rational sets. This
easily allows the construction of sets that cannot be axiomatized in monadic
second order logic. Therefore, one could try to extend the expressive power of
this logic suitably.

2 Basic Definitions 

2.1 Order Theory 

Let (V, ≤) be a partially ordered set. We write x ∥ y for elements x, y ∈ V if
they are incomparable. A set A ⊆ V is an antichain provided the elements of A
are mutually incomparable. The width of the partially ordered set (V, ≤) is the
least cardinal w(V, ≤) such that |A| ≤ w(V, ≤) for any antichain A. If w(V, ≤) is
a natural number, we say (V, ≤) has finite width. Note that there exist partially
ordered sets that contain only finite antichains, but have infinite width. We write
x −< y for x, y ∈ V if x < y and there is no element properly between x and
y. Furthermore, ↓y denotes the principal ideal {x ∈ V | x ≤ y} generated by
y ∈ V.
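For finite posets the width defined above is simply the size of a largest antichain, which a brute-force search can compute. The following is a minimal sketch for toy sizes; the function name `width` and the callback `less_eq` are illustrative choices, and the infinite case discussed in the text is out of scope for this code.

```python
from itertools import combinations

def width(elements, less_eq):
    """Size of a largest antichain of a finite poset.
    less_eq(a, b) must return True exactly when a <= b in the order."""
    elements = list(elements)
    for size in range(len(elements), 0, -1):
        for cand in combinations(elements, size):
            # cand is an antichain if its members are pairwise incomparable
            if all(not less_eq(a, b) and not less_eq(b, a)
                   for a, b in combinations(cand, 2)):
                return size
    return 0  # empty poset

def divides(a, b):
    return b % a == 0

# The divisors of 12 ordered by divisibility; {2, 3} and {4, 6} are antichains.
w_div = width([1, 2, 3, 4, 6, 12], divides)
```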

An N-free poset (V, ≤) is a nonempty, at most countably infinite partially
ordered set such that the partially ordered set (N, ≤_N) cannot be embedded
into (V, ≤) (cf. picture below), any antichain in (V, ≤) is finite, and ↓x is finite
for any x ∈ V. Let Γ be an alphabet, i.e. a nonempty finite set. Then NF^∞(Γ)
denotes the set of all Γ-labeled N-free posets (V, ≤, λ). These labeled posets are
called N-free pomsets. Let NF(Γ) denote the set of finite N-free pomsets over Γ.

[Picture: the poset (N, ≤_N)]

Next, we define the sequential and the parallel product of Γ-labeled posets:
Let t_1 = (V_1, ≤_1, λ_1) and t_2 = (V_2, ≤_2, λ_2) be Γ-labeled posets with V_1 ∩ V_2 = ∅.
The sequential product t_1 · t_2 of t_1 and t_2 is the Γ-labeled partial order

(V_1 ∪ V_2, ≤_1 ∪ ≤_2 ∪ V_1 × V_2, λ_1 ∪ λ_2).

Thus, in t_1 · t_2, the labeled poset t_2 is put on top of the labeled poset t_1. In
contrast, the parallel product t_1 ∥ t_2 is defined to be

(V_1 ∪ V_2, ≤_1 ∪ ≤_2, λ_1 ∪ λ_2),

i.e. here the two partial orders are set side by side. By SP(Γ), we denote the
least class of Γ-labeled posets containing the singletons that is closed under the
application of the sequential product · and the parallel product ∥.
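The two products translate directly into set operations on finite labeled posets. The sketch below represents a labeled poset as a triple (vertices, reflexive order as a set of pairs, label dictionary); this encoding, the helper names, and the assumption that N consists of four elements with exactly the relations a < c, b < c, b < d are illustrative choices, not taken from the paper.

```python
from itertools import permutations

def singleton(name, letter):
    """One-vertex labeled poset."""
    return ({name}, {(name, name)}, {name: letter})

def seq(t1, t2):
    """Sequential product: every vertex of t1 lies below every vertex of t2."""
    (v1, o1, l1), (v2, o2, l2) = t1, t2
    assert not (v1 & v2), "vertex sets must be disjoint"
    return (v1 | v2, o1 | o2 | {(x, y) for x in v1 for y in v2}, {**l1, **l2})

def par(t1, t2):
    """Parallel product: the two orders are put side by side."""
    (v1, o1, l1), (v2, o2, l2) = t1, t2
    assert not (v1 & v2), "vertex sets must be disjoint"
    return (v1 | v2, o1 | o2, {**l1, **l2})

def contains_N(t):
    """True if four vertices induce exactly the relations a<c, b<c, b<d."""
    v, o, _ = t
    for a, b, c, d in permutations(v, 4):
        wanted = {(a, c), (b, c), (b, d)}
        actual = {(x, y) for x in (a, b, c, d) for y in (a, b, c, d)
                  if x != y and (x, y) in o}
        if actual == wanted:
            return True
    return False

# (a || b) . (c || d) is series-parallel and therefore N-free.
sp = seq(par(singleton("a", "x"), singleton("b", "y")),
         par(singleton("c", "x"), singleton("d", "y")))
# The poset N itself, written down explicitly.
n_poset = ({1, 2, 3, 4},
           {(i, i) for i in range(1, 5)} | {(1, 3), (2, 3), (2, 4)},
           {1: "x", 2: "x", 3: "x", 4: "x"})
```

The check on `sp` illustrates the equality SP(Γ) = NF(Γ) for one finite example only; it is not a proof of it.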

To construct infinite labeled posets, we extend the sequential product ·
naturally to an infinite one as follows: For i ∈ ω, let t_i = (V_i, ≤_i, λ_i) be mutually
disjoint Γ-labeled posets. Then the infinite sequential product is defined by

∏_{i∈ω} t_i = (⋃_{i∈ω} V_i, ⋃_{i∈ω} ≤_i ∪ ⋃_{i<j} V_i × V_j, ⋃_{i∈ω} λ_i).

By SP^∞(Γ), we denote the least class C of Γ-labeled posets such that

- SP(Γ) ⊆ C,

- s, t ∈ C implies s ∥ t ∈ C,

- s, t ∈ C and s finite imply s · t ∈ C, and

- t_i ∈ C finite for i ∈ ω implies ∏_{i∈ω} t_i ∈ C.

Thus, a Γ-labeled poset belongs to SP^∞(Γ) if it can be constructed from the
finite Γ-labeled pomsets applying the sequential product, the parallel product
or the infinite product.

Based on results from [?], we extend the known equality SP(Γ) = NF(Γ) [?]
to infinite Γ-labeled posets:

Lemma 1. Let Γ be an alphabet. Then SP^∞(Γ) = NF^∞(Γ).

Proof. By induction on the construction of an element of SP^∞(Γ), one shows
the inclusion SP^∞(Γ) ⊆ NF^∞(Γ). For the converse inclusion, let t ∈ NF^∞(Γ).
We may assume that t is connected. By [?, Lemma 4.10], either t is an infinite
sequential product of finite N-free pomsets, or there exist s ∈ NF(Γ) and t_1, t_2 ∈
NF^∞(Γ) with t = s · (t_1 ∥ t_2). Then we can proceed inductively with t_1 and t_2.
This inductive decomposition will eventually terminate since antichains in t are
finite. Since NF(Γ) = SP(Γ), this finishes the proof. □
The sequential, the parallel and the infinite sequential products can easily
be extended to sets of (finite) N-free pomsets as follows: Let S ⊆ NF(Γ) and
S', T' ⊆ NF^∞(Γ). Then we define

S · T' := {s · t | s ∈ S, t ∈ T'}, S^+ := {s_1 · s_2 ⋯ s_n | n > 0, s_i ∈ S},

S' ∥ T' := {s ∥ t | s ∈ S', t ∈ T'} and S^ω := {∏_{i∈ω} s_i | s_i ∈ S}.

The class of series-rational languages is the least class C of subsets of
NF(Γ) such that

- {s} ∈ C for s ∈ NF(Γ), and

- S ∪ T, S · T, S ∥ T, S^+ ∈ C for S, T ∈ C.

Note that we do not allow the iteration of the parallel product in the
construction of series-rational languages. Therefore, for any series-rational language
S there exists an n ∈ ω with w(s) ≤ n for any s ∈ S, i.e. any series-rational
language is width-bounded.
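The width bound claimed above can be read off a series-rational expression by structural recursion. The sketch below assumes the standard facts that the width of a sequential product is the maximum of the widths of its factors and the width of a parallel product is their sum; the tuple encoding of expressions is a hypothetical representation chosen for illustration.

```python
def width_bound(expr):
    """Upper bound on the width of every pomset denoted by a series-rational
    expression. Expressions are nested tuples:
      ("atom", w)        a single pomset of width w
      ("union", A, B)    A ∪ B
      ("seq", A, B)      A · B
      ("par", A, B)      A ∥ B
      ("plus", A)        A^+ (sequential iteration)
    """
    kind = expr[0]
    if kind == "atom":
        return expr[1]
    if kind in ("union", "seq"):
        return max(width_bound(expr[1]), width_bound(expr[2]))
    if kind == "par":
        return width_bound(expr[1]) + width_bound(expr[2])
    if kind == "plus":
        return width_bound(expr[1])  # iterating sequentially adds no width
    raise ValueError(f"unknown constructor: {kind}")

# ({s} ∥ ({t} ∪ {u}))^+ with w(s) = 1, w(t) = 2, w(u) = 1
example = ("plus", ("par", ("atom", 1), ("union", ("atom", 2), ("atom", 1))))
```

The recursion terminates with a finite number precisely because there is no constructor for parallel iteration, which mirrors the remark in the text.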

The class of ω-series-rational languages is the least class C of subsets of
NF^∞(Γ) such that

- {s} ∈ C for s ∈ NF(Γ),

- S ∪ T, S ∥ T ∈ C for S, T ∈ C, and

- S^+, S^ω, S · T ∈ C for S, T ∈ C and S ⊆ NF(Γ).

For the same reason as for series-rational languages, any ω-series-rational
language is width-bounded. It is easily seen that the series-rational languages
are precisely those ω-series-rational languages that contain only finite labeled
posets.

2.2 Traces 

We recall the basic definitions and some results from the theory of Mazurkiewicz
traces since they will be used in our proofs, in particular in Section 4.2.

A dependence alphabet (Γ, D) is a finite set Γ together with a binary reflexive
and symmetric relation D that is called dependence relation. The complementary
relation I = Γ² \ D is the independence relation. From a dependence alphabet,
we define a binary operation * on the set of Γ-labeled posets as follows: Let
t_i = (V_i, ≤_i, λ_i) be disjoint Γ-labeled posets (i = 1, 2). Furthermore, let E =
{(x, y) ∈ V_1 × V_2 | (λ_1(x), λ_2(y)) ∈ D}. Then

t_1 * t_2 := (V_1 ∪ V_2, ≤_1 ∪ (≤_1 ∘ E ∘ ≤_2) ∪ ≤_2, λ_1 ∪ λ_2)

is the trace product of t_1 and t_2 relative to (Γ, D). Let M(Γ, D) denote the least
class of Γ-labeled posets closed under the application of the trace product *
that contains the singletons. The elements of M(Γ, D) are called traces. The set
M(Γ, D) together with the trace product as a binary operation is a semigroup
(it is not a monoid since we excluded the empty poset from our considerations).
Note that M(Γ, D) consists of finite posets, only. One can show that a finite
nonempty Γ-labeled poset (V, ≤, λ) belongs to M(Γ, D) if and only if we have
for any x, y ∈ V:

(a) x −< y implies (λ(x), λ(y)) ∈ D, and (b) x ∥ y implies (λ(x), λ(y)) ∉ D.

This characterization leads to the definition of an infinitary extension of traces:
A Γ-labeled poset (V, ≤, λ) is a real trace if V is at most countably infinite, ↓x is
finite for any x ∈ V, and (a) and (b) hold in (V, ≤, λ). The set of real traces over
the dependence alphabet (Γ, D) is denoted by R(Γ, D). Note that the infinite
product t_0 * t_1 * t_2 ⋯ of finite traces t_i ∈ M(Γ, D) can be naturally defined and
yields a real trace.
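The trace product and the characterization by conditions (a) and (b) can be tried out on finite posets. The sketch below encodes a labeled poset as a triple (vertices, reflexive order as a set of pairs, label dictionary); the encoding, the function names and the three-letter dependence alphabet are assumptions made for illustration only.

```python
def singleton(name, letter):
    """One-vertex labeled poset."""
    return ({name}, {(name, name)}, {name: letter})

def trace_product(t1, t2, D):
    """t1 * t2: order pairs across the factors come from <=_1 o E o <=_2,
    where E links dependent vertices of t1 and t2."""
    (v1, o1, l1), (v2, o2, l2) = t1, t2
    assert not (v1 & v2), "vertex sets must be disjoint"
    E = {(x, y) for x in v1 for y in v2 if (l1[x], l2[y]) in D}
    cross = {(x, y) for x in v1 for y in v2
             if any((x, a) in o1 and (b, y) in o2 for (a, b) in E)}
    return (v1 | v2, o1 | cross | o2, {**l1, **l2})

def satisfies_trace_conditions(t, D):
    """Check (a) covering pairs are dependent, (b) incomparable pairs are not."""
    v, o, lab = t
    for x in v:
        for y in v:
            if x == y:
                continue
            if (x, y) in o:
                covering = not any(z not in (x, y) and (x, z) in o and (z, y) in o
                                   for z in v)
                if covering and (lab[x], lab[y]) not in D:
                    return False
            elif (y, x) not in o and (lab[x], lab[y]) in D:
                return False
    return True

# Hypothetical dependence alphabet: a and b depend on each other, c is independent.
D = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "b"), ("b", "a")}
t = singleton(1, "a")
for name, letter in [(2, "b"), (3, "c"), (4, "a")]:
    t = trace_product(t, singleton(name, letter), D)
strict = {(x, y) for (x, y) in t[1] if x != y}
# A sequential product a . c is not a trace here, since (a, c) is not in D.
not_a_trace = ({1, 3}, {(1, 1), (3, 3), (1, 3)}, {1: "a", 3: "c"})
```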

A set L ⊆ R(Γ, D) of real traces is recognizable if there exists a finite
semigroup (S, *) and a semigroup homomorphism η : M(Γ, D) → (S, *) such that
t_0 * t_1 * t_2 ⋯ ∈ L implies s_0 * s_1 * s_2 ⋯ ∈ L for any finite traces s_i, t_i ∈ M(Γ, D)
with η(t_i) = η(s_i) for i ∈ ω.

2.3 Monadic Second Order Logic 

In this section, we will define monadic second order formulas and their
interpretations over Γ-labeled pomsets. Monadic formulas involve first order variables
x, y, z, ... for vertices and monadic second order variables X, Y, Z, ... for sets
of vertices. They are built up from the atomic formulas λ(x) = a for a ∈ Γ,
x ≤ y, and x ∈ X by means of the boolean connectives ¬, ∨, ∧, →, ↔ and
quantifiers ∃, ∀ (both for first order and for second order variables). Formulas
without free variables are called sentences. The satisfaction relation ⊨ between
Γ-labeled posets t = (V, ≤, λ) and monadic sentences φ is defined canonically
with the understanding that first order variables range over the vertices of V
and second order variables over subsets of V.

Let C be a set of Γ-labeled posets and φ a monadic sentence. Furthermore, let
L = {t ∈ C | t ⊨ φ} denote the set of posets from C that satisfy φ. Then we say
that the sentence φ axiomatizes the set L relative to C or that L is monadically
axiomatizable relative to C.

In [?], it was shown that a set of real traces is recognizable if and only if
it is monadically axiomatizable relative to the set of all real traces. This result
generalizes Büchi’s Theorem that states the same fact for ω-words.

3 From ω-Series-Rational
to Monadically Axiomatizable Sets

Let t = (V, ≤, λ) be some N-free pomset and X ⊆ V. Since any antichain in t is
finite, the set X is finite if and only if it is bounded by an antichain. Hence the
formula ∃A ∀a, b, x (((a, b ∈ A ∧ a ≤ b) → b ≤ a) ∧ (x ∈ X → ∃c : (c ∈ A ∧ x ≤ c)))
expresses that the set X is finite. We denote this formula, which will be useful in
the following proof, by finite(X).

Lemma 2. Let Γ be an alphabet and let L ⊆ NF^∞(Γ) be an ω-series-rational
language. Then L is monadically axiomatizable relative to NF^∞(Γ).
Proof. Clearly, any set {s} with s finite can be monadically axiomatized. Now
let S and T be two sets of N-free pomsets axiomatized by the monadic sentences
σ and τ, respectively. Then S ∪ T is axiomatized by σ ∨ τ. The set S ∥ T consists
of all N-free pomsets satisfying

∃X(X ≠ ∅ ∧ X^co ≠ ∅ ∧ ∀x∀y(x ∈ X ∧ y ∉ X → x ∥ y) ∧ σ↾X ∧ τ↾X^co)

where σ↾X is the restriction of σ to the set X and τ↾X^co that of τ to the
complement of X. The sequential product can be dealt with similarly.

Next we show that S^+ can be described by a monadic sentence: The idea
of a sentence axiomatizing S^+ is to color the vertices of an N-free pomset s
by two colors such that the coloring corresponds to a factorization in factors
s = s_1 · s_2 · s_3 ⋯ s_n where every factor s_i belongs to S. The identification of the
S-factors will be provided by the property of being a maximal convex one-colored
set. More formally, we define φ = σ ∨ ∃X∃Y(φ_1 ∧ φ_X ∧ φ_Y ∧ finite(X) ∧ finite(Y))
where φ_1 asserts that X and Y form a partition of the set of vertices such that
vertices from X and vertices from Y are mutually comparable. The formula φ_X
states that the maximal subsets of X that are convex satisfy σ, i.e.

φ_X = ∀Z((Z ⊆ X ∧ Z is convex ∧ Z ≠ ∅ ∧
          ∀Z'(Z ⊆ Z' ⊆ X ∧ Z' is convex → Z = Z')) → σ↾Z)

and the formula φ_Y is defined similarly with Y taking the place of X. Asserting
that the sets X and Y are finite ensures that the sentence φ is satisfied by finite
N-free pomsets, only. Hence we get indeed that φ axiomatizes S^+.

To axiomatize S^ω, we can proceed similarly to the case S^+. □

The remaining pages are devoted to the converse of the above theorem, i.e.
to the question whether all monadically axiomatizable sets are ω-series-rational.
Before we continue, let us sketch an idea how to tackle this problem, and explain
why we will not follow this way. Any N-free pomset is the value of a term over
the signature {·, ∥, ∏} ∪ Γ (where ∏ has arity ω). For a set L of N-free pomsets,
let T_L denote the set of all terms over the given signature whose value belongs
to L. Note that T_L is a set of trees labeled by the elements from {·, ∥, ∏} ∪ Γ.
Similarly to [?], one can show that T_L is monadically axiomatizable whenever L
is monadically axiomatizable. Hence, by the results from [?], T_L is recognizable.
This implies that T_L is a rational tree language over the alphabet {·, ∥, ∏} ∪ Γ ∪ X
for some finite set X of additional symbols. Since I do not know how to infer
that L is series-rational in case T_L is rational over an extended alphabet, I follow
another way in the proof of the converse of the above theorem.

4 From Monadically Axiomatizable
to ω-Recognizable Sets

4.1 ω-Recognizable Sets

Recall that a set of infinite words L ⊆ Γ^ω is Büchi-recognizable if there exists
a finite semigroup (S, *) and a semigroup homomorphism η : Γ^+ → (S, *) such
that for any u_i, v_i ∈ Γ^+ with η(u_i) = η(v_i) for i ∈ ω, we have u_0 u_1 u_2 ⋯ ∈ L
if and only if v_0 v_1 v_2 ⋯ ∈ L (cf. [?]). Here, we use this characterization as a
definition and transfer it into the context of N-free pomsets:

Let S be a set that is equipped with two binary operations · and ∥. We assume
these two operations to be associative and, in addition, ∥ to be commutative.
Then (S, ·, ∥) is an sp-algebra. Note that the set of finite N-free pomsets is an
sp-algebra. Mappings between sp-algebras that commute with the two products
will be called sp-homomorphisms.

Let X be a set of variables that will range over elements of NF(Γ). We call
the terms over · and ∥ that contain variables in X finite terms. Now let t_i be
finite terms for i ∈ ω. Then ∏_{i∈ω} t_i is a term and any finite term is a term.
Furthermore, if t is a finite term and t_i are terms for 1 ≤ i ≤ n, then t · t_1 and
t_1 ∥ t_2 ∥ ⋯ t_n are terms, too. Now let f : X → NF(Γ). Then f(t) ∈ NF^∞(Γ) is
defined naturally for any term t. Let L ⊆ NF^∞(Γ). Then L is ω-recognizable if
there exists a finite sp-algebra (S, ·, ∥) and an sp-homomorphism η : NF(Γ) →
(S, ·, ∥) such that for any term t and any mappings f, g : X → NF(Γ) with
η ∘ f = η ∘ g, we have f(t) ∈ L if and only if g(t) ∈ L. In this case, we will say
that the sp-homomorphism η recognizes L. In [?], recognizable subsets of NF(Γ)
are defined: A set L ⊆ NF(Γ) is recognizable if there exists a finite sp-algebra
(S, ·, ∥) and an sp-homomorphism η : NF(Γ) → (S, ·, ∥) such that L = η^{−1}η(L).
One can easily check that L ⊆ NF(Γ) is recognizable in this sense if and only if
it is ω-recognizable.

Example 3. Let (V, ≤) be a tree without maximal elements, such that ↓v is finite
for any v ∈ V, any node has at most 2 upper neighbors, and almost all nodes
from V have only one upper neighbor. Let n be the number of branching points
of (V, ≤). Then we call (V, ≤) a tree with n branching points. Note that (V, ≤, λ)
is an N-free pomset.

Now let N be a set of natural numbers and let L_N denote the set of all
Γ-labeled trees with n branching points for some n ∈ N. We show that L_N is
ω-recognizable:

We consider the sp-algebra S = {1, 2} with the mapping η : NF(Γ) → S
defined by η(t) = min(w(t), 2) for any t ∈ NF(Γ). To obtain an sp-homomorphism,
let x ∥ y = min(2, x + y) and x · y = max(x, y) for any x, y ∈ S. Now let T be a
term and f, g : X → NF(Γ) with η ∘ f = η ∘ g. Furthermore, assume f(T) ∈ L_N,
i.e. that f(T) is a tree with n ∈ N branching points. As f(T) has no leaves,
every parallel product ∥ in T is applied to two non-finite terms and similarly
the second factor of every sequential product · in T is a non-finite term. Hence
every variable x_i (that occurs in T at all) occurs in T either as a left factor of
a sequential product · or within the scope of an infinite product ∏. Since f(T)
is a tree, this implies that f(x_i) is a (finite) linear order, i.e. w(f(x_i)) = 1. Now
η ∘ f = η ∘ g implies w(g(x_i)) = 1. Hence the N-free pomset g(T) differs from
the tree with n branching points f(T) only in some non-branching pieces. Thus,
g(T) is a tree with n branching points, i.e. g(T) ∈ L_N as required. Hence we
showed that L_N is indeed ω-recognizable.
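The two-element sp-algebra of Example 3 is small enough to verify exhaustively. The sketch below checks associativity of both operations and commutativity of ∥, and then checks that t ↦ min(w(t), 2) is compatible with the products on the level of widths; it assumes the standard facts w(s · t) = max(w(s), w(t)) and w(s ∥ t) = w(s) + w(t) for finite pomsets, and all function names are illustrative.

```python
from itertools import product

S = (1, 2)

def seq_op(x, y):
    """Sequential product in the algebra of Example 3."""
    return max(x, y)

def par_op(x, y):
    """Parallel product in the algebra of Example 3."""
    return min(2, x + y)

def eta_of_width(w):
    """Image of a finite pomset of width w under the map t -> min(w(t), 2)."""
    return min(w, 2)

def is_sp_algebra(elems, seq, par):
    """Closure, associativity of both operations, commutativity of par."""
    closed = all(seq(a, b) in elems and par(a, b) in elems
                 for a, b in product(elems, repeat=2))
    assoc = all(seq(seq(a, b), c) == seq(a, seq(b, c)) and
                par(par(a, b), c) == par(a, par(b, c))
                for a, b, c in product(elems, repeat=3))
    comm = all(par(a, b) == par(b, a) for a, b in product(elems, repeat=2))
    return closed and assoc and comm

# Homomorphism check on widths 1..6, using the assumed width rules:
# w(s . t) = max(w(s), w(t)) and w(s || t) = w(s) + w(t).
hom_ok = all(
    eta_of_width(max(a, b)) == seq_op(eta_of_width(a), eta_of_width(b)) and
    eta_of_width(a + b) == par_op(eta_of_width(a), eta_of_width(b))
    for a in range(1, 7) for b in range(1, 7)
)
```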
By the example above, the number of ω-recognizable subsets of NF^∞(Γ) is
2^{ℵ_0}. Since there are only countably many monadic sentences or ω-series-rational
languages, not all ω-recognizable sets are monadically axiomatizable or
ω-series-rational. Later, we will see that these three notions coincide for width-bounded
sets. But first, we show that any ω-recognizable set of N-free pomsets of finite
width is of a special form (cf. Proposition 6).

Let (S, ·, ∥) be an sp-algebra. Then a pair (s, e) ∈ S² is linked if s · e = s
and e · e = e. A simple term of order 1 is an element of S or a linked pair (s, e).
Now let n > 1, σ_i for i = 1, 2, ..., n be simple terms of order n_i, and s ∈ S. Then
s · (σ_1 ∥ σ_2 ∥ ⋯ σ_n) is a simple term of order n_1 + n_2 + ... + n_n.
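Linked pairs and the order of a simple term are easy to compute for a concrete finite algebra. The sketch below enumerates the linked pairs of a finite semigroup given by its multiplication and computes the order of a simple term; the tuple encoding of simple terms (`("linked", s, e)` and `("node", s, [subterms])`) is a hypothetical representation chosen for this illustration.

```python
def linked_pairs(elems, mul):
    """All pairs (s, e) with s * e == s and e * e == e."""
    return [(s, e) for s in elems for e in elems
            if mul(s, e) == s and mul(e, e) == e]

def order(term):
    """Order of a simple term.
    An element of S or a linked pair ("linked", s, e) has order 1;
    ("node", s, [subterms]) stands for s . (σ_1 ∥ ... ∥ σ_n) and has
    the sum of the orders of its subterms."""
    if isinstance(term, tuple) and term[0] == "node":
        return sum(order(sub) for sub in term[2])
    return 1

# Sequential product of the two-element algebra S = {1, 2}: x . y = max(x, y).
pairs_max = linked_pairs((1, 2), max)

# Addition modulo 3: the only idempotent is 0.
pairs_mod3 = linked_pairs((0, 1, 2), lambda x, y: (x + y) % 3)

nested = ("node", 2, [1, ("linked", 2, 1), ("node", 1, [2, 2])])
```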

For an sp-homomorphism η : NF(Γ) → S and a simple term σ, we define the
language L_η(σ) inductively: If σ ∈ S, we set L_η(σ) := η^{−1}(σ). For a linked pair
(s, e), we define L_η(s, e) := η^{−1}(s) · (η^{−1}(e))^ω. Furthermore,
L_η(s · (σ_1 ∥ σ_2 ∥ ⋯ σ_n)) := η^{−1}(s) · (L_η(σ_1) ∥ L_η(σ_2) ∥ ⋯ L_η(σ_n)).

Lemma 4. Let Γ be an alphabet, (S, ·, ∥) a finite sp-algebra and η : NF(Γ) →
(S, ·, ∥) an sp-homomorphism. Let furthermore t ∈ NF^∞(Γ) be an N-free pomset
of finite width. Then there exist simple terms τ_1, τ_2, ..., τ_m of order at most w(t)
with m ≤ w(t) and t ∈ L_η(τ_1) ∥ L_η(τ_2) ∥ ⋯ L_η(τ_m).

Proof. If t is finite, the lemma is obvious since the element s = η(t) of S is a
simple term of order 1. Thus, we may assume t to be infinite. First consider
the case that t = ∏_{i∈ω} t_i is an infinite product of finite N-free pomsets t_i. Let
s_i := η(t_i). A standard application of Ramsey’s Theorem [?] (cf. also [?]) yields
the existence of positive integers n_i for i ∈ ω and a linked pair (s, e) ∈ S² such
that s = s_0 s_1 ⋯ s_{n_0} and e = s_{n_i+1} · s_{n_i+2} ⋯ s_{n_{i+1}} for i ∈ ω. Hence t ∈ L_η(s, e).
Since (s, e) is a simple term of order 1 ≤ w(t), we showed the lemma for infinite
products of finite N-free pomsets.

Now the proof proceeds by induction on the width w(t) of an N-free pomset
t. By [?], t is either a parallel product, or an infinite sequential product, or of
the form s · (t_1 ∥ t_2) for some finite N-free pomset s. In any of these cases, one
uses the induction hypothesis (which is possible since e.g. w(t_1) < w(t) in the
third case). □

The following lemma shows that the set $L_\eta(\tau)$ is contained in any $\omega$-recognizable
set $L$ that intersects $L_\eta(\tau)$.

Lemma 5. Let $\Gamma$ be an alphabet, $(S, \cdot, \parallel)$ a finite sp-algebra, $\eta : \mathrm{NF}(\Gamma) \to (S, \cdot, \parallel)$
an sp-homomorphism, and $\tau_i$ a simple term for $1 \le i \le m$. Let furthermore
$t, t' \in L_\eta(\tau_1) \parallel L_\eta(\tau_2) \parallel \cdots \parallel L_\eta(\tau_m)$. Then there exist a term $T$ and
mappings $f, g : X \to \mathrm{NF}(\Gamma)$ with $\eta \circ f = \eta \circ g$ such that $f(T) = t$ and $g(T) = t'$.

Proof. First, we show the lemma for the case $m = 1$ (for simplicity, we write $\tau$ for
$\tau_1$): In this restricted case, the lemma is shown by induction on the construction
of the simple term $\tau$. For $\tau = s \in S$, the term $T = x$ and the mappings
$f(y) = t$ and $g(y) = t'$ for any $y \in X$ have the desired properties. For a linked
pair r = (s,e), the term T = can be used to show the statement. For 
$\tau = s \cdot (\tau_1 \parallel \tau_2 \parallel \cdots \parallel \tau_n)$, one sets $T = x \cdot (T_1 \parallel T_2 \parallel \cdots \parallel T_n)$ where $x \in X$ and $T_i$
is a term that corresponds to $\tau_i$. We can assume that no variable occurs both in $T_i$
and in $T_j$ for $i \ne j$ and that $x$ does not occur in any of the terms $T_i$. Then the
functions $f_i$ and $g_i$, which exist by the induction hypothesis, can be joined, which
results in functions $f$ and $g$ satisfying the statement of the lemma. □

Let $L$ be an $\omega$-recognizable set of N-free pomsets of finite width. The following
proposition states that $L$ is the union of languages of the form
$L_\eta(\tau_1) \parallel L_\eta(\tau_2) \parallel \cdots \parallel L_\eta(\tau_m)$. But this union might be infinite. The proof is
immediate by Lemmas 4 and 5.

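Proposition 6. Let $\Gamma$ be an alphabet and $L \subseteq \mathrm{NF}^\infty(\Gamma)$ be a set of N-free pomsets
of finite width. Let $L$ be recognized by the sp-homomorphism $\eta : \mathrm{NF}(\Gamma) \to (S, \cdot, \parallel)$,
and let $\mathcal{T}$ denote the set of finite tuples of simple terms $(\tau_1, \tau_2, \dots, \tau_m)$
such that $\emptyset \ne L \cap (L_\eta(\tau_1) \parallel L_\eta(\tau_2) \parallel \cdots \parallel L_\eta(\tau_m))$. Then

$L = \bigcup_{(\tau_1, \tau_2, \dots, \tau_m) \in \mathcal{T}} \bigl( L_\eta(\tau_1) \parallel L_\eta(\tau_2) \parallel \cdots \parallel L_\eta(\tau_m) \bigr).$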

4.2 Monadically Axiomatizable Sets of Bounded Width 

Now, we concentrate on sets of N-free pomsets whose width is uniformly bounded 
by some integer. It is our aim to show that a set of bounded width that is 
monadically axiomatizable relative to $\mathrm{NF}^\infty(\Gamma)$ is $\omega$-recognizable.

Let $(V, \le)$ be a partially ordered set. Following we define a directed
graph $(V, E) = \mathrm{spine}(V, \le)$, called spine, of $(V, \le)$ as follows: The edge relation
is a subset of the strict order $<$. A pair $(x, y)$ with $x < y$ belongs to $E$ if either
$x \mathrel{-\!\!<} y$ or for any $z \in V$ we have $x \mathrel{-\!\!<} z \Rightarrow z < y$ as well as $z \mathrel{-\!\!<} y \Rightarrow x < z$.
Thus, $(x, y) \in E$ if either $x \mathrel{-\!\!<} y$ or $x < y$ and any upper neighbor of $x$ (any
lower neighbor of $y$) is below $y$ (above $x$, respectively).
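For a finite poset the spine can be computed directly from this definition. The sketch below is only an illustration under stated assumptions: the poset is given by a strict-order predicate `less`, the helper names `covers` and `spine` are hypothetical, and the example (divisors of 12 ordered by divisibility) is not from the paper.

```python
def covers(elements, less):
    """Covering pairs (x, y): x < y with no element strictly between them."""
    return {(x, y) for x in elements for y in elements
            if less(x, y)
            and not any(less(x, z) and less(z, y) for z in elements)}

def spine(elements, less):
    """Edges of the spine: covering pairs plus pairs x < y such that every
    upper neighbor of x is below y and every lower neighbor of y is above x."""
    cov = covers(elements, less)
    edges = set(cov)
    for x in elements:
        for y in elements:
            if not less(x, y) or (x, y) in cov:
                continue
            ups_ok = all(less(z, y) for (a, z) in cov if a == x)
            downs_ok = all(less(x, z) for (z, b) in cov if b == y)
            if ups_ok and downs_ok:
                edges.add((x, y))
    return edges

# Hypothetical example: divisors of 12 ordered by divisibility.
D = [1, 2, 3, 4, 6, 12]
def divides_strictly(a, b):
    return a != b and b % a == 0

S = spine(D, divides_strictly)
```

In this example the pair (1, 4) is not a spine edge because the upper neighbor 3 of 1 is not below 4, whereas (1, 6) is an edge.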

Let $\mathrm{maxco}(\mathrm{spine}(V, \le))$ denote the maximal size of a totally unconnected set
of vertices in the graph $\mathrm{spine}(V, \le)$. The restriction of the following lemma to
finite partially ordered sets was shown in The extension we use here is an 
obvious variant of this result: 

Lemma 7 (cf. |12,]). Let n > 0. There exists a dependence alphabet (Tn,Dn) 
with the following property: Let (V, <) be a poset with maxco(spine(y, <)) < n 
such that fx is finite for any x G V. Then there exists a mapping A : L — >■ T„ 
such that (fy<,A) G M(r'„,iA„) is a real trace over the dependence alphabet 
(r„, Dn). 

We introduced the spine of a partially ordered set and mentioned the result 
of Hoogeboom and Rozenberg since the spine of an N-free partially ordered set 
is “small” as the following lemma states. 

Lemma 8. Let E be an alphabet and t = (fy <,A) G NF°“(L') be an N-free 
pomset of finite width. Then maxco(spine(f)) < 2w(t) ■ (w(t) -\- 1). 
Let $\Gamma$ and $\Sigma$ be two alphabets. A set $L$ of $\Sigma$-labeled posets is the projection
of a set $M$ of $\Gamma$-labeled posets if there exists a mapping $\pi : \Gamma \to \Sigma$ such that
$L = \{(V, \le, \pi \circ \lambda) \mid (V, \le, \lambda) \in M\}$, i.e. $L$ is the set of relabeled (w.r.t. $\pi$) posets
from $M$.

Now let L be a set of N-free pomsets of width at most n. Then the two 
lemmas above show that L is the projection of a set of real traces over a finite 
dependence alphabet. Because of this relation, we now start to consider sets of 
N-free real traces: 

Languages of N-free real traces. Recall that we want to show that any monadi- 
cally axiomatizable set of N-free pomsets is w-recognizable. By |^, any monad- 
ically axiomatizable set of (N-free) real traces is recognizable. Therefore, we 
essentially have to show that any set L C M(T, D) of N-free real traces that is 
recognizable in K.(T', D), is w-recognizable in NF(T). Thus, in particular we have 
to construct from a semigroup homomorphism t] : M(T, D) — >■ (S, *) into a finite 
semigroup (S,*) an sp-homomorphism 7 : NF(T) — >■ ||) into some finite 

sp-algebra. This is the content of Lemma nni that is prepared by the following 
definition and lemma. 

Let alph^(R, <,A) := (A o min(R, <), A(R), A o max(R, <)) for any finite T- 
labeled poset (R, <, A). Let 6 be a set of finite T-labeled posets (the two exam- 
ples, we will actually consider, are C = NF(T) and C = M(T, D)) and 77 : 6 — >■ S' 
a mapping. Then 77 is strongly alphabetic if it is surjective and if r](ti) = 77(^2) 
implies alph^(ti) = alph^(t2) for any ti = {Vi, <, Aj) G C. Using the concept of a 
dependence chain, one can easily show the existence of a semigroup homomor- 
phism 77 into a finite semigroup such that 77 is strongly alphabetic: 

Lemma 9. Let (T, D) be a dependence alphabet. There exists a finite semigroup 
(S, *) and a strongly alphabetic semigroup homomorphism 77 : M(T, D) — >■ (S, *). 

Lemma 10. Let {T, D) be a dependence alphabet and rj : M(T', D) — >■ (S, *) be 
a strongly alphabetic semigroup homomorphism into a finite semigroup (S, *). 
Then there exists a finite sp-algebra (S+, - ,11) with S C S+ and a strongly alpha- 
betic sp-homomorphism 7 : NF(T) — >■ (S+,-, ||) such that 

1. rj{t) = 7(f) for t G M(T, D) fl NF(T) and 

2. 7(f) G S implies t G M{T, D) nNF(T) for any t G NF(T). 

Proof Since 77 is strongly alphabetic, there is a function alph^ : S — >■ (T(T) \ 
{0})^ with alph^(77(t)) = alph^(t) for any trace t G M(T, D). From the semigroup 
(S, +) and the function alph^, we define an sp-algebra (Si, -,11) as follows: Let 



Si = S0{7{r) \ {0})^ and extend the function alph^ to Si by alph^(A') = X for 
A G (T(r) \ {0})3. Now let 



x-y = 




( 



alph^(a;) alph^(7/) otherwise 



X *y 



if 7T2 o alph^(a;) x 7T2 o alph^(7/) C I 



x\\y = 
for any x,y G S where is the componentwise union of elements of (iP(F)\{ 0 })^. 

Let furthermore x-X = X- x = x\\X = X\\x = XLI^ alph^(a:) for any 
X G Si and X G (iP(d^) \ { 0 })^- One can easily check that the mappings • 
and II are associative and that the parallel product || is commutative. Now let 
7 : NF(I^) — >• Si be defined by 7(f) = 77(f) for t G M{r,D) fl NF(F) and 
7(f) = alph^(f) for t G NF(F) \ M(F, D) and let S~^ be the image of 7. Then 7 
is a strongly alphabetic sp-homomorphism onto {S^ , •, ||). □ 



Lemma 11. Let (r,D) be a dependence alphabet. Then R(F, H) nNF“(T') is 
Lu -recognizable. 

Proof. By Lemma El there exists a strongly alphabetic semigroup homomor- 
phism a : M(F, D) — ?> (T, *) into a finite semigroup (T, *). Then, by Lemma E3 
we find a strongly alphabetic sp-homomorphism 77 : NF(T') — >■ (S', •, ||) that coin- 
cides with a on M(T, II)nNF(T) such that 77(f) G T implies f G M(T, Z?)riNF(T). 
Let furthermore f,g:X^ NF(T) be functions with go f = -qo g. By induction 
on the construction of a term f, one shows that /(f) is a trace if and only g{t) is 
a trace. □ 

From a term f, we construct a finite or infinite sequence lin(f) over X in- 
ductively: First, lin(a:i) = {xi). Now let L for z G a; be finite terms. Then 
lin(fi • t2) = lin(fi || t2) is the concatenation of the sequences lin(fi) and lin(f2). 
Similarly, lin(nig„ tf) is the concatenation of the sequences lin(fi). Now let fi, f2 
be terms and f be a finite term. Then lin(f • fi) is the concatenation of the 
sequences lin(f) and lin(fi) (note that lin(f) is finite). If lin(fi) is finite, let 
lin(fi II t^) be the concatenation of lin(fi) with lin(f2) and, if lin(fi) is infinite 
but lin(f2) is finite, let lin(fi || t2) be the concatenation of lin(f2) with lin(fi). 
If both lin(fi) and lin(f2) are infinite, lin(fi || f2) is the alternating sequence 
{y\,yl,yhyl,---) with lin(f,) = {y\,y\,...) for f = 1,2. 

For a term f with lin(f) = (j/i, t/2j ■ • • ) and a mapping f : X ^ NF(T), 
let ★(/, f) := f{yi) * /(j/2) * /(f/s) • • ■ denote the infinite trace product of the 
pomsets f{yi). Note that -k{f,t) is a T-labeled poset that in general need not 
be a trace nor an N-free pomset. The following lemma implies that in certain 
cases ★(/, f) is a real trace. It is shown by induction on the depth of a term. 

Lemma 12. Let (P, D) be a dependence alphabet. Let t be a term and / : X — >■ 
NF(T). ///(f) is a real trace, then f{t) — -k{f,t). 

Now we can show that at least any monadically axiomatizable set of N-free 
real traces is w-recognizable: 

Proposition 13. Let $(\Gamma, D)$ be a dependence alphabet and $\kappa$ be a monadic sentence.
Then the set $L = \{t \in \mathbb{R}(\Gamma, D) \cap \mathrm{NF}^\infty(\Gamma) \mid t \models \kappa\}$ is $\omega$-recognizable.

Proof. By E|, the set L is a recognizable language of real traces. Hence there is 
a semigroup homomorphism 77 : M(T, D) — >■ ( 5 , *) into a finite semigroup (S', +) 
such that, for any sequences Xi,yi of finite traces with q{xi) = q{yi), we have 






$x_1 * x_2 * \cdots \in L$ if and only if $y_1 * y_2 * \cdots \in L$. By Lemma 9 we may assume that
$\eta$ is strongly alphabetic. By Lemma 10, there exists a finite sp-algebra $(S^+, \cdot, \parallel)$
and a strongly alphabetic sp-homomorphism $\eta^+ : \mathrm{NF}(\Gamma) \to S^+$ that coincides
with $\eta$ on $\mathbb{M}(\Gamma, D) \cap \mathrm{NF}(\Gamma)$.

By Lemma im there exists an sp-homomorphism S : NF(F, D) — >■ (T, -, 11 ) 
that recognizes K.(F, D) fl NF(F). Now let a = 77+ x 5 : NF(F) S x T. We 
show that a recognizes L\ 

Let t be a term with lin(t) = (7/1, 772, ■ • ■ ) and f,g:X^ NF(r) be mappings 
with ao f = ao g. Suppose f{t) S L. Then f{t) is a real trace. Since So f = Sog 
and S recognizes K(F, D) nNF°°(F), the F-labeled poset g{t) is a real trace, too. 

From LemmaEl we obtain f{t) — ★(/, t) and g{t) — ir{g,t). Since f{t) S L, 
we have in particular /(7/1) * /(ys) * fivs)--- S L. Note that f{yi),g{]ji) G 
M(F,D)nNF(F) and that 77+0/ = r]~^og. Since 77+ and 77 coincide on M(T,D)n 
NF(F), this implies rj{f{yi)) = v{g{yi))- Since 77 recognizes the language of real 
traces L, we obtain g{yi) * 5(7/2) * 5(2/3) • • • G L. □ 

Languages of N- free pomsets. Following LemmaEl we explained that any width- 
bounded set of N-free pomsets is the projection of a set of N-free real traces over 
some dependence alphabet. This is the crucial point in the following proof. 

Proposition 14 . Let m G lu, let S be an alphabet and ip a monadic sen- 
tence over S. Then the set L = {t £ NF°°(i 7 ) \ t \= (p and w{t) < m} is 
uj -recognizable. 

Proof. Let (Pn^Dn) be the dependence alphabet from Lemma 0 with n = 
2 m{m -\- 1 ). Now let T = Pn x S and {(A,a),{B,b)) G D iff (A,B) G 
for any {A, a ), {B, b) G P. Let 7 T(y, <, A) = (V, <, tt2 o A) be the canonical pro- 
jection from r -labeled posets to the set of A 7 -labeled posets and consider the set 
iF := {t G K(T, £>) | 7 T(t)GL}. 

By Lemma 0 n{K) = L. In the monadic sentence ip, replace any subformula 
of the form A(a;) = a by X{x) = {A, a) and denote the resulting sentence 

by K. Note that x: is a monadic sentence over the alphabet P and that K = {t £ 
M(T, D) \t\= >c]. Since K C NF°°(T), we can apply Proposition^^ and obtain 
that K is 07 -recognizable. Now one can show that the class of cj-recognizable 
languages is closed under projections which gives the desired result. □ 

5 From $\omega$-Recognizable to $\omega$-Series-Rational Sets

To finish the proof of our main theorem, it remains to show that any $\omega$-recognizable
set whose elements have a uniformly bounded width is $\omega$-series-rational. This
result is provided by the following proposition:

Proposition 15. Let $\Gamma$ be an alphabet and $L \subseteq \mathrm{NF}^\infty(\Gamma)$ be $\omega$-recognizable and
width-bounded. Then $L$ is $\omega$-series-rational.

Proof. Let $\eta : \mathrm{NF}(\Gamma) \to (S, \cdot, \parallel)$ be an sp-homomorphism that recognizes $L$.
Furthermore, let $n \in \omega$ be such that $w(t) \le n$ for any $t \in L$. Now let $\mathcal{T}$ denote
the set of $(\le n)$-tuples of simple terms $(\tau_1, \dots, \tau_k)$ of order at most $n$ such
that $\emptyset \ne L \cap (L_\eta(\tau_1) \parallel L_\eta(\tau_2) \parallel \cdots \parallel L_\eta(\tau_k))$. Note that $\mathcal{T}$ is finite. The proof
of Proposition 6 yields that $L$ is the union of the languages
$L_\eta(\tau_1) \parallel L_\eta(\tau_2) \parallel \cdots \parallel L_\eta(\tau_k)$ over all tuples from $\mathcal{T}$.

Hence it remains to show that $L_\eta(\tau)$ is $\omega$-series-rational for any simple term
$\tau$ such that there exists $(\sigma_1, \sigma_2, \dots, \sigma_k) \in \mathcal{T}$ with $\sigma_i = \tau$ for some $1 \le i \le k$.
This proof proceeds by induction on the subterms of $\tau$ and uses in particular
the fact that any width-bounded and recognizable set in $\mathrm{NF}(\Gamma)$ is series-rational

jH|. □ 

Now our main theorem follows from Lemma El Propositions II 41 a.nd ITHl 

Theorem 16. Let $\Gamma$ be an alphabet and $L \subseteq \mathrm{NF}^\infty(\Gamma)$. Then the following are
equivalent:

1. $L$ is $\omega$-series-rational.

2. $L$ is monadically axiomatizable relative to $\mathrm{NF}^\infty(\Gamma)$ and width-bounded.

3. $L$ is $\omega$-recognizable and width-bounded.

Recall that a set of finite N-free pomsets is recognizable in the sense of HS| 
if and only if it is w-recognizable. Therefore, a direct consequence of the main 
theorem (together with the results from |1 1 6] where the remaining definitions 
can be found) is the following 

Corollary 17. Let $\Gamma$ be an alphabet and $L \subseteq \mathrm{NF}(\Gamma)$ be width-bounded. Then $L$
can be accepted by a branching automaton iff it is recognizable iff it is series-rational
iff it is monadically axiomatizable.
Abstract. In this paper we give a proof that it is decidable for a
deterministic tree automaton on infinite trees with Rabin acceptance
condition, if there exists an equivalent nondeterministic automaton with
Büchi acceptance condition. In order to prove this we transform an
arbitrary deterministic Rabin automaton to a certain canonical form. Using
this canonical form we are able to say if there exists a Büchi automaton
equivalent to the initial one. Moreover, if it is the case, the canonical
form allows us also to find a respective Büchi automaton.
1 Introduction

Automata on infinite sequences and infinite trees were first introduced by J.R.
Büchi (1962) P and M.O. Rabin (1969) P in order to prove the decidability
of the monadic logic of one and of many successors, respectively. Later they
have also proved useful in modeling finite state systems whose behaviour can
be infinite. They have found applications in describing the behaviour of concurrent
processes in real time systems and in the construction of automatic program
verification systems. There exist several classes of finite automata having different
expressive power. Among them a meaningful role is played by automata with
Büchi and Rabin acceptance conditions (see [6,7]). The theory of these automata
gives rise to many natural questions which have not been answered until now.
One of these is treated in this paper. It is known that the classes of tree
languages recognized by Rabin automata with successive indices induce a proper
hierarchy (see P]). Languages recognized by Büchi automata are located at a
low level of this classification. In this context, it is natural to ask if, given some
Rabin automaton A for a language L, we can determine the minimal index of
a Rabin automaton recognizing L. For deterministic automata on infinite words
this problem has been solved. T. Wilke and H. Yoo |E| gave an algorithm which
computes the aforementioned minimal index for any Rabin language consisting
of infinite words. At the same time there have been attempts to consider a
similar problem for automata on infinite trees in the case of some more restricted
subclasses of Rabin automata. The paper P) by D. Niwiński and I. Walukiewicz
gives a method of computing the minimal Rabin index for the class
U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 663-|^7^ 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 
of infinite tree languages, all of whose paths are in some $\omega$-regular set of infinite
words. The problem in general has not yet been solved for automata on infinite
trees. In this paper we make one step in this direction, giving a procedure which
determines if a language accepted by a deterministic Rabin automaton can be
accepted by a (possibly nondeterministic) Büchi automaton. In terms of the
hierarchy described below it means determining if the minimal index of a language
accepted by a deterministic tree Rabin automaton is equal to 1. Our problem is
related to the well-known open question of determining the alternation level of a
formula of the $\mu$-calculus. Recently M. Otto [0| showed a decision procedure to
determine whether a formula is equivalent to an alternation-free formula. It is
known that the expressive power of Büchi automata corresponds to that of the
formulas of the alternation level however we do not know of any direct
$\mu$-calculus characterization of deterministic automata. Nevertheless we may hope
that an extension of our techniques can help to solve, at least to some extent, the
aforementioned $\mu$-calculus problem.

2 Basic Definitions and Notations 

2.1 Mathematical Preliminaries 

Let $\omega$ denote the set of natural numbers. For an arbitrary function $f$, we denote
its domain by $\mathrm{Dom}(f)$. Throughout the paper $\Sigma$ denotes a finite alphabet. For
a set $X$, $X^*$ is the set of finite words over $X$, including the empty word $\varepsilon$. We
denote the length of a word $w$ by $|w|$ ($|\varepsilon| = 0$). We let $X^\omega$ denote the set of
infinite words over $X$. The concatenation $wu$ of words $w \in X^*$ and $u \in X^* \cup X^\omega$
is defined as usual. A word $w \in X^*$ is a prefix of $v \in X^* \cup X^\omega$, in symbols:
$w \le v$, if there exists $u$ such that $v = wu$; the symbols $\ge$, $>$ etc. have their usual
meaning.

An infinite binary tree over $\Sigma$ is any function $t : \{0, 1\}^* \to \Sigma$. We denote the
set of all such functions by $T_\Sigma$. An incomplete binary tree over $\Sigma$ is a function
$t : X \to \Sigma$, where $X \subseteq \{0, 1\}^*$ is closed under prefixes, i.e. $w \in X \wedge w' \le w \Rightarrow w' \in X$,
and satisfies the following condition: $\forall w \in X\ ((w0 \in X \wedge w1 \in X) \vee (w0 \notin X \wedge w1 \notin X))$.
The set of incomplete binary trees over $\Sigma$ is denoted by $TP_\Sigma$.
Note that $T_\Sigma \subseteq TP_\Sigma$. A node $w \in \mathrm{Dom}(t)$ is a leaf of $t$ if $w0 \notin \mathrm{Dom}(t)$ and
$w1 \notin \mathrm{Dom}(t)$. We denote the set of leaves of $t$ by $\mathrm{Fr}(t)$. A node which is not
a leaf is called an inner node. Let $t \in TP_\Sigma$ and $w \in \mathrm{Dom}(t)$. Then the symbol $t.w$
denotes the subtree of the tree $t$ induced by the node $w$, defined by

$\mathrm{Dom}(t.w) = \{v \mid wv \in \mathrm{Dom}(t)\}$
$t.w(v) = t(wv)$, for $v \in \mathrm{Dom}(t.w)$

A word $\pi$ from the set $\{0, 1\}^\omega$ is a path in a tree $t \in TP_\Sigma$ if all its finite prefixes
are in $\mathrm{Dom}(t)$. A word $\pi$ from the set $\{0, 1\}^*$ is a path in the tree $t \in TP_\Sigma$ if
$\pi \in \mathrm{Dom}(t)$ and $\pi \in \mathrm{Fr}(t)$. Throughout this paper whenever we use the symbol
$\pi$ we will mean a path in the sense of the above definitions.
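The definitions of incomplete trees, leaves and induced subtrees can be made concrete for finite trees. The following sketch is an illustration only, under the assumption that a finite incomplete tree is stored as a Python dict from node addresses (strings over "0" and "1") to labels; the function names are hypothetical and infinite trees are out of scope.

```python
def is_incomplete_tree(t):
    """Check prefix closure and that every node has both children or none."""
    dom = set(t)
    prefix_closed = all(w[:i] in dom for w in dom for i in range(len(w)))
    siblings_ok = all((w + "0" in dom) == (w + "1" in dom) for w in dom)
    return prefix_closed and siblings_ok

def leaves(t):
    """Fr(t): nodes without children."""
    return {w for w in t if w + "0" not in t and w + "1" not in t}

def subtree(t, w):
    """t.w: the tree v -> t(wv) on all v with wv in Dom(t)."""
    return {u[len(w):]: label for u, label in t.items() if u.startswith(w)}

# Hypothetical example tree with root "" and labels a..e.
t = {"": "a", "0": "b", "1": "c", "10": "d", "11": "e"}
sub = subtree(t, "1")
```

Note that `subtree` reads the label at the address `wv`, that is, the address of `w` followed by `v`.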

Below we present some notations used throughout this paper for certain subsets 
of {0,1}*: 
$[w, v] = \{u \in \{0, 1\}^* \mid w \le u \le v\}$

$(w, v) = \{u \in \{0, 1\}^* \mid w < u < v\}$

TT^ = {u G {0, 1}* I W < M < 7t} 

We write $\exists! x\, \phi(x)$ to mean that there exists exactly one $x$ satisfying the formula $\phi(x)$;
similarly, a formula $\exists^\infty x\, \phi(x)$ means that there exist infinitely many $x$'s satisfying
$\phi(x)$. The symbol $\veebar$ will denote exclusive or. For an $n$-tuple $x = (g_1, \dots, g_n)$ we
define $x|_i = g_i$.

Now we will introduce some notations concerning directed graphs, used throughout
this paper. A cycle $S$ in an arbitrary graph $\mathcal{G}(V, E)$ is a sequence of nodes
$a_1, a_2, \dots, a_k$ such that $k > 2$, $a_1 = a_k$, and $(a_i, a_{i+1}) \in E$ for $i \in \{1, \dots, k - 1\}$;
it will be denoted by $[a_1, a_2, \dots, a_k]$. For a cycle $S = [a_1, a_2, \dots, a_k]$ the notation
$(S)$ denotes the set $\{a_1, a_2, \dots, a_k\}$. Given an arbitrary vertex subset $X$ and a
cycle $S$ in a graph $\mathcal{G}(V, E)$ we say that the cycle $S$ omits the set $X$ if $(S) \cap X = \emptyset$.
A connected component of a vertex $p \in V$ with respect to the set $X \subseteq V$
is defined as follows:

$\mathrm{Con}(X, p) = \{q \in V \mid \exists S,\ \text{a cycle in the graph } \mathcal{G},\ \text{such that } ((S) \cap X = \emptyset \wedge p, q \in (S))\}$

The set of all connected components with respect to some set $X$ in a graph $\mathcal{G}$ is
denoted by $\mathrm{Con}(X, \mathcal{G})$.
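Since a cycle in the above sense may repeat vertices, a vertex q lies on a common cycle with p that omits X exactly when q is reachable from p, and p from q, by walks of at least one edge that avoid X. The sketch below computes Con(X, p) this way; it is an illustration under that reading of the definition, and the names `reach_plus` and `con` as well as the example graph are assumptions.

```python
def reach_plus(edges, start, banned):
    """Vertices reachable from start by a walk of at least one edge
    whose vertices all avoid the banned set."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in banned and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def con(edges, banned, p):
    """Con(banned, p): vertices sharing a cycle with p that omits banned."""
    if p in banned:
        return set()
    forward = reach_plus(edges, p, banned)
    backward = reach_plus({(b, a) for (a, b) in edges}, p, banned)
    return forward & backward

# Hypothetical example graph.
E = {(1, 2), (2, 3), (3, 1), (3, 4), (4, 3), (5, 5), (6, 1)}
```

For instance, vertex 6 has an outgoing edge but lies on no cycle, so its component is empty.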

2.2 Stopping Automata 

Definition 1. A nondeterministic stopping automaton with Rabin acceptance
condition is a 6-tuple $\mathcal{A} = (Q, \Sigma, q_0, \Delta, \Omega, A)$, whose components satisfy
the following conditions, respectively:

- $q_0 \in Q$

- $\Delta \subseteq Q \times \Sigma \times (Q \cup A) \times (Q \cup A)$

- $\Omega = \{(L_1, U_1), \dots, (L_n, U_n), (\emptyset, U_{n+1})\}$ where $L_i, U_i \subseteq Q$ for every $i$

- $Q \cap A = \emptyset$

$Q$ is a finite set of states. $A$ describes a finite set of stopping states disjoint from
$Q$. $\Omega$ will be called the set of acceptance conditions. Sometimes when the set of
stopping states is empty we will omit the last component in the above 6-tuple
and we will not call such an automaton a stopping one.

In this paper there will also appear another type of stopping Rabin automaton,
whose initial state is omitted, which we denote by $(Q, \Sigma, -, \Delta, \Omega, A)$. If a
construction requires giving the initial state, we will use the following notation:

Definition 2. For an arbitrary stopping automaton $\mathcal{A}$, $\mathcal{A}[p]$ denotes the stopping
automaton $\mathcal{A}$ with the initial state set to $p$.

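A run of a stopping automaton $\mathcal{A}$ on a tree $t \in T_\Sigma$ is an incomplete tree
$r_{t,\mathcal{A}} \in TP_{Q \cup A}$ defined by conditions:

- $r_{t,\mathcal{A}}(\varepsilon) = q_0$

- $\forall w \in (\mathrm{Dom}(r_{t,\mathcal{A}}) \setminus \mathrm{Fr}(r_{t,\mathcal{A}}))\ (r_{t,\mathcal{A}}(w), t(w), r_{t,\mathcal{A}}(w0), r_{t,\mathcal{A}}(w1)) \in \Delta$

- $\forall w \in \mathrm{Fr}(r_{t,\mathcal{A}})\ r_{t,\mathcal{A}}(w) \in A$

- $\forall w \in (\mathrm{Dom}(r_{t,\mathcal{A}}) \setminus \mathrm{Fr}(r_{t,\mathcal{A}}))\ r_{t,\mathcal{A}}(w) \in Q$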

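Note that once the automaton reaches a stopping state it stops operating. For
a run $r_{t,\mathcal{A}}$ and an arbitrary set $X \subseteq \{0, 1\}^*$ the notation $r_{t,\mathcal{A}}(X)$ will be used to
denote the set $\{r_{t,\mathcal{A}}(w) \mid w \in X\}$. We also introduce another notation concerning
the runs of stopping automata:

$\mathrm{In}(r_{t,\mathcal{A}}|\pi) = \{q \mid \exists^\infty w < \pi\ (r_{t,\mathcal{A}}(w) = q)\}$

We say that a run $r_{t,\mathcal{A}}$ is accepting, which we denote by $r_{t,\mathcal{A}} \in \mathrm{Acc}$, if:

$\forall \pi\ (\exists i \in \{1, \dots, n + 1\}\ (\mathrm{In}(r_{t,\mathcal{A}}|\pi) \cap L_i = \emptyset \wedge \mathrm{In}(r_{t,\mathcal{A}}|\pi) \cap U_i \ne \emptyset))$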
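Once the set of states occurring infinitely often on a path is known, the acceptance condition for that path reduces to a simple test over the pairs. The following sketch illustrates that test only; the function name `path_accepts` and the example pairs are assumptions, and computing the set In itself for an infinite run is outside its scope.

```python
def path_accepts(inf_states, conditions):
    """Rabin condition on one path: some pair (L, U) has no state of L
    and at least one state of U among the infinitely recurring states."""
    inf_states = set(inf_states)
    return any(not (inf_states & set(L)) and bool(inf_states & set(U))
               for (L, U) in conditions)

# Hypothetical acceptance set with n = 1: the pairs (L1, U1) and (empty, U2).
omega = [({"a"}, {"b"}), (set(), {"c"})]
```

The last pair has an empty first component, so it behaves like a Büchi condition on its second component.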

We reserve the symbol $L(\mathcal{A})$ for the language accepted by a stopping automaton
$\mathcal{A}$, defined by:

$L(\mathcal{A}) = \{t \in T_\Sigma \mid \exists r_{t,\mathcal{A}} \in \mathrm{Acc}\}$

In this paper we assume that all stopping automata in consideration will satisfy
the following conditions:

- $\forall a \in \Sigma\ \forall q \in Q\ \exists p, r \in (Q \cup A)\ (q, a, p, r) \in \Delta$

- $\Omega = \{(L_1, U_1), (L_2, U_2), \dots, (L_n, U_n), (\emptyset, U_{n+1})\}$, where:
  $(\forall i \in \{1, \dots, n\}\ (L_i \ne \emptyset \wedge L_i \cap U_i = \emptyset)) \wedge (U_{n+1} \cap \bigcup_{i=1}^{n} (L_i \cup U_i) = \emptyset)$

- $\forall x \in \bigcup_{i=1}^{n} L_i \cup U_{n+1}\ \exists t \in T_\Sigma\ \exists r_{t,\mathcal{A}} \in \mathrm{Acc}\ \exists w \in \{0, 1\}^*\ r_{t,\mathcal{A}}(w) = x$

Observe that the first dependency guarantees the existence of a run of a stopping
automaton over every tree. States which satisfy the last condition are said to be
reachable in some accepting run. It is easy to see that the imposed conditions
do not diminish the expressive power of the stopping automata. We would like
to emphasize that sets $U_i$ for $i \in \{1, \dots, n + 1\}$ can be empty. The index of a
stopping automaton $\mathcal{A}$ is denoted by $\mathrm{Ind}(\mathcal{A})$.

We write $\mathrm{Ind}(\Omega)$ instead of $\mathrm{Ind}(\mathcal{A})$, where $\Omega$ is the set of acceptance conditions
of a stopping automaton $\mathcal{A}$. Moreover for the elements of the set $\Omega$ we use the
subsequent notation:

$\forall x = (L_i, U_i) \in \Omega\ (x|_1 = L_i \wedge x|_2 = U_i)$

Definition 3. Stopping automata with index 1 are called Büchi automata.

Languages accepted by some stopping automaton with Rabin (Büchi) acceptance
condition are called Rabin (Büchi) languages. It is worth noting that classes
of languages recognized by automata with Rabin acceptance condition having
successive indices induce a proper hierarchy (see 0).
Definition 4. A stopping automaton $\mathcal{A}$ is deterministic if it satisfies the
following condition:

$\forall q \in Q\ \forall q', q'', p', p'' \in (Q \cup A)\ \forall a \in \Sigma\ ((q, a, q', q'') \in \Delta \wedge (q, a, p', p'') \in \Delta \Rightarrow q' = p' \wedge q'' = p'')$

Note that if we deal with deterministic automata then we can assume that there
exists exactly one run of such an automaton over a given tree.
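The determinism condition says that the transition relation, viewed as a map from pairs (state, letter) to pairs of successors, is single-valued. A minimal sketch of this check, assuming the transition relation is given as a Python set of 4-tuples and using the hypothetical name `is_deterministic`:

```python
def is_deterministic(delta):
    """True iff each (state, letter) has at most one pair of successors."""
    successors = {}
    for (q, a, left, right) in delta:
        # setdefault stores the first successor pair seen for (q, a)
        if successors.setdefault((q, a), (left, right)) != (left, right):
            return False
    return True

# Hypothetical transition relations over states q, s and letters a, b.
delta_det = {("q", "a", "q", "s"), ("q", "b", "s", "s")}
delta_nondet = delta_det | {("q", "a", "s", "q")}
```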

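Definition 5. We say that a stopping automaton $\mathcal{A}$ is frontier deterministic
if the following dependency holds true:

$\forall t \in T_\Sigma\ \forall r_{t,\mathcal{A}}, s_{t,\mathcal{A}}\ (\mathrm{Fr}(r_{t,\mathcal{A}}) = \mathrm{Fr}(s_{t,\mathcal{A}}) \wedge (\forall w \in \mathrm{Fr}(r_{t,\mathcal{A}})\ r_{t,\mathcal{A}}(w) = s_{t,\mathcal{A}}(w)))$

Consequently, frontier determinism guarantees that for a given complete tree all
possible runs stop in the same nodes, reaching the same states.

We will use the notation $\mathcal{A} \sim \mathcal{B}$ to denote equivalence between frontier
deterministic stopping automata $\mathcal{A}$ and $\mathcal{B}$. This concept is expressed formally as
follows:

- $L(\mathcal{A}) = L(\mathcal{B})$

- $\forall t \in T_\Sigma\ \forall r_{t,\mathcal{A}}, s_{t,\mathcal{B}}\ (\mathrm{Fr}(r_{t,\mathcal{A}}) = \mathrm{Fr}(s_{t,\mathcal{B}}) \wedge \forall w \in \mathrm{Fr}(r_{t,\mathcal{A}})\ r_{t,\mathcal{A}}(w) = s_{t,\mathcal{B}}(w))$

Observe that the second condition refers to all runs of automata $\mathcal{A}$ and $\mathcal{B}$, not
only to the accepting ones.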

Set X C Dom(t) is an antichain with respect to the order relation < if any two 
elements of X are incomparable. Let / be a function associating trees t(w) G 
with elements w from X. Then we denote the substitution by f[f], the limit 
of the sequence by lim tn and the iteration of t along the interval [u, re] by 
fFM, Definitions of the above concepts and also the concept of the trace of 
iteration are well known and can be found e.g. in the paper by Niwinski Pj 



3 Relevant States 



Let $\mathcal{A} = (Q, \Sigma, q_0, \Delta, \Omega, A)$ denote an arbitrary stopping automaton, where
$\Omega = \{(L_1, U_1), (L_2, U_2), \dots, (L_n, U_n), (\emptyset, U_{n+1})\}$.



Definition 6. We say that some state $q \in \bigcup_{i=1}^{n} L_i$ is irrelevant in a run $r_{t,\mathcal{A}}$
if its occurrences are covered by a finite number of paths in this run:

$\exists \pi_1, \dots, \pi_k \in \{0, 1\}^\omega\ (r_{t,\mathcal{A}}(w) = q \Rightarrow \exists i \in \{1, \dots, k\}\ w < \pi_i)$  (1)

Furthermore we say that a state $q \in \bigcup_{i=1}^{n} L_i$ is irrelevant for an automaton $\mathcal{A}$ if
its occurrences are irrelevant in an arbitrary accepting run of this automaton.
A state which is not irrelevant is called relevant. Observe that if a state $p$ is
irrelevant for a stopping automaton $\mathcal{A}$ then for an arbitrary tree $t \in L(\mathcal{A})$ and
an arbitrary accepting run $r_{t,\mathcal{A}}$ there exists a natural number $K$ such that:

$\forall w \in \{0, 1\}^*\ ((r_{t,\mathcal{A}}(w) = p \wedge |w| > K) \Rightarrow \exists! \pi\ (w < \pi \wedge p \in \mathrm{In}(r_{t,\mathcal{A}}|\pi)))$  (2)

Less formally, if in some node below level $K$ the automaton reaches a state $p$
then this state must occur infinitely often on some uniquely determined path
going through this node.

Definition 7. A gadget of states $p$ and $q$ admitted by an automaton $\mathcal{A}$ is a
4-tuple $G(p, q) = (g_1, g_2, g_3, g_4)$ with $g_i \in \{0, 1\}^*$ satisfying the following
conditions:



- 9i < 92 < 93 ^ 9i < 92 < 9i 

- 53 ^ 54 A 53 ^ 34 

- 3 t e L{A) 3rt^A e Acc 3j S (1, . . . , n}( rt^A{gi) = P A rt^A(.9i) = q A 
n,A{92) = rt,A{93) =ueUt A rt,A{[92,93]) O L, = 0) 

- rt,A{[gi,93]) n Un+i = 0 A r(,^([5i,54]) O 17„+i = 0 



From now on when we say that an automaton $\mathcal{A}$ admits a gadget $G$ in a tree
$t$ in a run $r_{t,\mathcal{A}}$, we mean that $t$ and $r_{t,\mathcal{A}}$ satisfy the two conditions above. Note
that in this case the language accepted by the automaton $\mathcal{A}$ includes also the
iteration of the tree $t$ along the interval $[g_2, g_3]$.

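Definition 8. An automaton $\mathcal{A}$ has the accessibility property iff

$\forall p, p' \in \bigcup_{i=1}^{n} L_i\ \exists t \in L(\mathcal{A}[p])\ \exists r_{t,\mathcal{A}[p]} \in \mathrm{Acc}\ \exists w \in \{0, 1\}^*\ (r_{t,\mathcal{A}[p]}(w) = p' \wedge w \ne \varepsilon)$

Clearly, if $\mathrm{Ind}(\mathcal{A}) = 1$ then $\bigcup_{i=1}^{n} L_i = \emptyset$ and therefore $\mathcal{A}$ has the accessibility
property. It is easy to prove that if an automaton $\mathcal{A}$ has the accessibility property,
admits a gadget $G(p, q)$ for some $p, q \in \bigcup_{i=1}^{n} L_i$ and $\mathrm{Ind}(\mathcal{A})$ is even (hence $U_{n+1} = \emptyset$),
then it actually admits a gadget $G(p, q)$ for any $p, q \in \bigcup_{i=1}^{n} L_i$. The subsequent
lemma characterizes relevantness of states using the notion of a gadget.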



Lemma 1. Let $\mathcal{A}$ be a frontier deterministic stopping automaton of the form
$(Q, \Sigma, q, \Delta, \Omega, A)$, whose index is even. Then the initial state $q$ is relevant for
the automaton $\mathcal{A}$ if and only if this automaton admits a gadget $G(q, q)$.



If the automaton A admits a gadget G{q, q) then using the iteration it is easy to 
obtain a tree accepted by A in which the occurrences of the state q are relevant. 
The reverse implication is only slightly more difficult - for the details see da- 
Presently we will prove a lemma, which allows us to transform a Rabin automaton
into Büchi form if we impose on all states in the set $\bigcup_{i=1}^{n} L_i$ some condition,
which uses the notion of relevantness.
Lemma 2. Let $\mathcal{A}$ be a deterministic stopping automaton of the form
$(Q, \Sigma, q, \Delta, \Omega, A)$, where $\Omega = \{(L_1, U_1), \dots, (L_n, U_n), (\emptyset, U_{n+1})\}$ and $n > 1$. If
for any state $s \in \bigcup_{i=1}^{n} L_i$, $s$ is irrelevant for the automaton $\mathcal{A}[s]$, then there exists
a frontier deterministic stopping automaton $\mathcal{B} = (Q', \Sigma, q', \Delta', \Omega', A)$ with index 1
equivalent to the automaton $\mathcal{A}$.

For the proof see cm- 

4 Canonical Decomposition of Deterministic Automaton 

4.1 On Representation of Automata in Form of a Composition 

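Consider a sequence of stopping automata $\mathcal{A}_1, \dots, \mathcal{A}_k$ defined by dependencies:

$\forall i \in \{1, \dots, k\}\ \mathcal{A}_i = (Q_i, \Sigma, p_i, \Delta_i, \Omega_i, A_i)$  (3)

$\forall i \in \{1, \dots, k\}\ \Omega_i = \{(L^i_1, U^i_1), \dots, (L^i_{n_i}, U^i_{n_i}), (\emptyset, U^i_{n_i+1})\}$  (4)

$\forall i, j \in \{1, \dots, k\}\ (i \ne j \Rightarrow Q_i \cap Q_j = \emptyset)$  (5)

We define an automaton denoted by the symbol $\mathcal{A}_1 \oplus \dots \oplus \mathcal{A}_k$ as a stopping
automaton $(Q', \Sigma, -, \Delta', \Omega', A')$, whose construction is presented below:

- $A' = (\bigcup_{i=1}^{k} A_i) \setminus Q'$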

The set $\Omega'$ is a little more tedious to define and comes as a result of the following
construction: Let $m = \max\{n_i \mid i \in \{1, \dots, k\}\} + 1$. We change each set $\Omega_i$
into a sequence consisting of its elements keeping the same notation: $\Omega_i(j)$ for
$j \in \{1, \dots, n_i + 1\}$, taking care that the only pair of the form $(\emptyset, U)$ is at the
position $n_i + 1$. Now we define a sequence $\Omega'(j)$ for $j \in \{1, \dots, m\}$:

$\Omega'(j) = \begin{cases} \Bigl( \bigcup_{1 \le i \le k,\ j \le n_i} \Omega_i(j)|_1,\ \bigcup_{1 \le i \le k,\ j \le n_i} \Omega_i(j)|_2 \Bigr) & \text{for } j < m \\ \Bigl( \emptyset,\ \bigcup_{1 \le i \le k} \Omega_i(n_i + 1)|_2 \Bigr) & \text{for } j = m \end{cases}$  (6)

As can be seen, we sum the conditions pointwise except for the last pairs, which
summed separately form the last condition. Now let $\Omega'$ be the set of the elements
of the sequence $\Omega'(i)$ for $i \in \{1, \dots, m\}$. Note that the above construction is
not unique. Transforming the set $\Omega_i$ into a sequence we can arrange its elements
in many different ways. This ambiguity however will have no influence on the
proofs and can be easily removed, if we define the set of acceptance conditions
of a stopping automaton as a sequence.
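The pointwise summation of acceptance conditions can be sketched as follows. This is an illustration under assumptions, not the paper's construction verbatim: each component's conditions are passed as a Python list of pairs of sets whose last pair has an empty first component, and the name `compose_conditions` is hypothetical.

```python
def compose_conditions(omegas):
    """Pointwise union of the acceptance pairs of several automata.
    omegas[i] lists the pairs of component i; its last pair is (empty, U)."""
    m = max(len(om) - 1 for om in omegas) + 1
    result = []
    for j in range(1, m):
        # only components that still have a proper pair at position j contribute
        active = [om for om in omegas if j <= len(om) - 1]
        L = set().union(*(om[j - 1][0] for om in active))
        U = set().union(*(om[j - 1][1] for om in active))
        result.append((L, U))
    # the last pairs are summed separately and form the last condition
    result.append((set(), set().union(*(om[-1][1] for om in omegas))))
    return result

# Hypothetical components with n_1 = 1 and n_2 = 2.
omega1 = [({"a"}, {"b"}), (set(), {"c"})]
omega2 = [({"x"}, {"y"}), ({"z"}, set()), (set(), set())]
composed = compose_conditions([omega1, omega2])
```

Fixing the order of the pairs in each list removes the ambiguity mentioned in the text, since the result then depends only on the given sequences.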
Definition 9. The automaton $\mathcal{A}_1 \oplus \dots \oplus \mathcal{A}_k$ described above will be called the
composition of a sequence $\mathcal{A}_1, \dots, \mathcal{A}_k$.

Note that the following inequalities hold true:

$\max\{\mathrm{Ind}(\Omega_i) \mid 1 \le i \le k\} \le \mathrm{Ind}(\Omega') \le \max\{\mathrm{Ind}(\Omega_i) \mid 1 \le i \le k\} + 1$  (7)

Note that the composition is a Büchi automaton if so is each automaton $\mathcal{A}_i$. For
a particular composition $\mathcal{B} = \mathcal{A}_1 \oplus \dots \oplus \mathcal{A}_k$ we define a function
$\mathrm{SwitchingStates}^{\mathcal{B}}$, which associates with each automaton
$\mathcal{A}_i = (Q_i, \Sigma, p_i, \Delta_i, \Omega_i, A_i)$ some subset of $Q_i$ in the following way:

$\mathrm{SwitchingStates}^{\mathcal{B}}(\mathcal{A}_i) = Q_i \cap \bigcup_{j=1}^{k} A_j$

The function $\mathrm{SwitchingStates}^{\mathcal{B}}$ describes which states of the given automaton
are the stopping states of the other components of the composition. In this paper
often compositions of the form $\mathcal{B} = \mathcal{A}_1 \oplus \dots \oplus \mathcal{A}_k$ (where $\mathcal{A}_i$ are defined as in
conditions (3)-(5)) will satisfy additional conditions, stated below.

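Definition 10. Assume that there exists a partition of the set $\{1, \dots, k\}$ into two
sets $X$, $Y$, which satisfy the conditions:

- $\forall i \in X\ \mathrm{Ind}(\mathcal{A}_i) = 1$

- $\forall t \in T_\Sigma\ \forall r_{t,\mathcal{B}}\ \forall \pi \in \{0, 1\}^\omega\ (\exists i \in Y\ (\mathrm{In}(r_{t,\mathcal{B}}|\pi) \cap Q_i \ne \emptyset \wedge \mathrm{In}(r_{t,\mathcal{B}}|\pi) \not\subseteq Q_i)) \Rightarrow (\exists i \in X\ (\mathrm{In}(r_{t,\mathcal{B}}|\pi) \cap U^i_1 \ne \emptyset))$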
7li 

— yi GY { Ind{Ai) > 1 SwitchingStates^ (Ai) = [J L]) 

i=i 

- $\forall i \in Y$: the states in $\mathrm{SwitchingStates}^{\mathcal{B}}(\mathcal{A}_i)$ are reachable in the accepting runs of the
automaton $\mathcal{B}$

Composition of a sequence of stopping automata is proper if there exists
a partition $X$, $Y$ such that the above conditions hold true.

Intuitively the second condition means: if in some run (not necessarily accepting) 
on some path the automaton B visits states of some automaton from the group 
Y infinitely often, but will never remain in its states forever, then the acceptance 
conditions are satisfied with regard to some automaton from the group X . 
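Again assume that $\mathcal{B} = \mathcal{A}_1 \oplus \ldots \oplus \mathcal{A}_k = (Q, \Sigma, -, \Delta, \Omega, A)$. Let us fix for some
$i \in \{1, \ldots, k\}$ an automaton $\mathcal{A}_i = (Q_i, \Sigma, \Delta_i, \Omega_i, A_i)$. Moreover let there exist
an automaton $\mathcal{C} = (Q', \Sigma, \Delta', \Omega', A')$ such that $Q' \cap Q = \emptyset$ and a function
$f : Q_i \to Q'$. Then for an arbitrary $q \in Q$ a substitution $\mathcal{C} \stackrel{f}{\mapsto} \mathcal{A}_i$ into the
composition $\mathcal{A}_1 \oplus \ldots \oplus \mathcal{A}_k[q]$ is a composition
$\mathcal{A}_1 \oplus \ldots \oplus \mathcal{A}_{i-1} \oplus \mathcal{C} \oplus \mathcal{A}_{i+1} \oplus \ldots \oplus \mathcal{A}_k[q']$, which we change in the following manner:

— if $q \in Q_i$, we take $q' = f(q)$, otherwise $q' = q$,

— all occurrences of states $p$ from the set $SwitchingStates^{\mathcal{B}}(\mathcal{A}_i)$ are replaced
by states $f(p)$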
4.2 Automaton Graph



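Let $\mathcal{A} = (Q, \Sigma, q_0, \Delta, \Omega, A)$ denote an arbitrary stopping automaton such that
$\Omega = \{(L_1, U_1), (L_2, U_2), \ldots, (L_n, U_n), (\emptyset, U_{n+1})\}$, where $n \ge 1$. We will construct
a directed automaton graph of $\mathcal{A}$, which will be denoted by $\mathcal{G}_{\mathcal{A}}$. Initially,
we define a directed graph $\mathcal{H}_{\mathcal{A}} = (V, E)$, where $V = \bigcup_{i=1}^{n} L_i \cup U_{n+1}$ and $E \subseteq V \times V$
has the following form:

$$E = \{(x, y) \in V \times V \mid \exists t \in L(\mathcal{A})\ \exists r_{t,\mathcal{A}} \in Acc\ \exists w, w' \in \{0,1\}^* \ (r_{t,\mathcal{A}}(w) = x \wedge r_{t,\mathcal{A}}(w') = y \wedge w < w' \wedge r_{t,\mathcal{A}}((w, w')) \cap V = \emptyset)\}$$

The above graph will be helpful in defining a directed stopping automaton graph
of $\mathcal{A}$, $\mathcal{G}_{\mathcal{A}} = (V', E')$. Namely, we define:

$$V' = Con(U_{n+1}, \mathcal{H}_{\mathcal{A}}) \cup \{\{q\} \mid Con(U_{n+1}, q) = \emptyset \wedge q \in V\},$$

$$E' = \{(X, Y) \in V' \times V' \mid \exists x \in X\ \exists y \in Y\ (x, y) \in E\}.$$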



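Let us note that, considering the character of the above construction and the
condition $U_{n+1} \cap \bigcup_{i=1}^{n} L_i = \emptyset$, which holds true, the following dependency is satisfied:

$$\forall X, Y \in V'\ (X \neq Y \Rightarrow X \cap Y = \emptyset) \ \wedge\ \forall X \in V'\ \Bigl(X \cap U_{n+1} = \emptyset \ \vee\ X \cap \bigcup_{i=1}^{n} L_i = \emptyset\Bigr)$$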



4.3 Canonical Form of Rabin Automaton 

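Lemma 3. Consider an arbitrary deterministic stopping automaton $\mathcal{A} = (Q, \Sigma, \Delta, \Omega, A)$
with index greater than 1, where $\Omega = \{(L_1, U_1), \ldots, (L_n, U_n), (\emptyset, U_{n+1})\}$.
Moreover we require that if its index is even, the automaton does
not have the accessibility property. Then there exists a sequence $(\mathcal{B}_i)_{i \le k}$
($k \ge 2$) of deterministic stopping automata whose composition
$\mathcal{B} = \mathcal{B}_1 \oplus \ldots \oplus \mathcal{B}_k = (Q', \Sigma, -, \Delta', \Omega', A')$ is proper and satisfies the conditions:

— $\forall i \in \{1, \ldots, k\}$ $\mathcal{B}_i = (Q_i, \Sigma, -, \Delta_i, \Omega_i, A_i)$ has the accessibility property,

— $\forall i \in \{1, \ldots, k\}$ $Ind(\mathcal{B}_i) \le Ind(\mathcal{A})$ (if $Ind(\mathcal{A})$ is odd, $Ind(\mathcal{B}_i) < Ind(\mathcal{A})$),

— $\exists f : Q \xrightarrow{1-1} Q'$ $(\forall p \in Q\ (\mathcal{A}[p] \equiv \mathcal{B}[f(p)]))$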

Moreover, if C = {(L[, U[), . . . , (L^, U^), (0, then the function f from 

the last condition satisfies a dependency: 

m n 

IjL'c/djA)- (8) 

n m 

(7^e/(U^*)\U L'j A 3 f G {1, . . . , /c} f{p) G B^) ^ Ind{Bi) = 1 (9) 

i=i j=i 



For the proof see HD). 
Lemma 4. Let $\mathcal{A} = \mathcal{A}_1 \oplus \ldots \oplus \mathcal{A}_k = (Q, \Sigma, q, \Delta, \Omega, A)$, where
$\Omega = \{(L_1, U_1), \ldots, (L_n, U_n), (\emptyset, U_{n+1})\}$, be a proper composition of frontier deterministic
stopping automata. Assume that there exists $i \in \{1, \ldots, k\}$ such that the automaton $\mathcal{A}_i$
is of the form $(Q_i, \Sigma, \Delta_i, \Omega_i, A_i)$ and its index is greater than 1. Let
$\Omega_i = \{(L^i_1, U^i_1), \ldots, (L^i_{n_i}, U^i_{n_i}), (\emptyset, U^i_{n_i+1})\}$. Suppose furthermore that there exists a
proper composition of frontier deterministic stopping automata
$\mathcal{B} = \mathcal{B}_1 \oplus \cdots \oplus \mathcal{B}_l = (Q', \Sigma, -, \Delta', \Omega', A')$ and a one-to-one function $f$ from $Q_i$ to $Q'$, which
satisfies the two conditions (8), (9) and such that for each $p \in Q_i$ the automaton
$\mathcal{A}_i[p]$ is equivalent to the automaton $\mathcal{B}[f(p)]$. Then the substitution $\mathcal{B} \stackrel{f}{\mapsto} \mathcal{A}_i$ into
the composition $\mathcal{A}_1 \oplus \cdots \oplus \mathcal{A}_k[q]$ is proper and equivalent to the composition $\mathcal{A}$.

For the proof see mg. 

Theorem 1. For an arbitrary deterministic stopping automaton $\mathcal{A}$ with index
greater than 1 there exist two sequences of stopping automata having the
accessibility property, $(\mathcal{A}_i)_{i \le l}$ and $(\mathcal{B}_j)_{j \le k}$, where $l, k \ge 0$, such that the
composition $\mathcal{A}_1 \oplus \ldots \oplus \mathcal{A}_l \oplus \mathcal{B}_1 \oplus \ldots \oplus \mathcal{B}_k[q']$ is equivalent to the automaton $\mathcal{A}$ for some
specifically chosen state $q'$. Furthermore, this composition satisfies the following
conditions:

1. for each $i \in \{1, \ldots, l\}$ the automaton $\mathcal{A}_i$ is frontier deterministic and its
index is 1

2. for each $i \in \{1, \ldots, k\}$ the automaton $\mathcal{B}_i$ is deterministic, and its index is
even

3. $Ind(\mathcal{B}_1) \le \ldots \le Ind(\mathcal{B}_k)$

4. for each $i \in \{1, \ldots, k\}$, $\mathcal{B}_i = (Q_i, \Sigma, -, \Delta_i, \Omega_i, A_i)$, where
$\Omega_i = \{(L^i_1, U^i_1), \ldots, (L^i_{n_i}, U^i_{n_i})\}$, and for any $p, r \in \bigcup_{j=1}^{n_i} L^i_j$ the automaton $\mathcal{B}_i[p]$ admits a
gadget $G(p, r)$.

Moreover, we can assume that the sequences $(\mathcal{A}_i)_{i \le l}$ and $(\mathcal{B}_j)_{j \le k}$ satisfy an
additional condition: $l \le 1$ and $k \le |\bigcup_{i=1}^{n} L_i|$.

The representation of the automaton $\mathcal{A}$ in the form of a composition satisfying
the above conditions will be called the canonical form.

Proof. If the index of the automaton $\mathcal{A}$ is odd, or it is even but the automaton
does not have the accessibility property, then according to Lemma 3 there
exists a sequence of deterministic stopping automata having the accessibility
property $(\mathcal{C}_i)_{i \le m}$, whose composition $\mathcal{C} = \mathcal{C}_1 \oplus \ldots \oplus \mathcal{C}_m[r]$ is equivalent to the
automaton $\mathcal{A}$ for some state $r$. Assume that we can find in the set $\{1, \ldots, m\}$
some $j$ such that $Ind(\mathcal{C}_j)$ is odd and greater than 1. If we again use Lemma 3, this
time for the automaton $\mathcal{C}_j$, and obtain a composition $\mathcal{D} = \mathcal{D}_1 \oplus \ldots \oplus \mathcal{D}_{m'}$ and
a function $f$ which establishes equivalence between $\mathcal{C}_j$ and the composition $\mathcal{D}$,
then by Lemma 4 we can construct the substitution $\mathcal{D} \stackrel{f}{\mapsto} \mathcal{C}_j$ into the composition
$\mathcal{C}$. Thus we obtain a new proper composition, all of whose elements have the
accessibility property and which is equivalent to the automaton $\mathcal{A}$. This procedure
can be repeated as long as among the components of the composition there are
automata whose index is odd and greater than 1. Let us note, however, that
this can be done only finitely many times, since according to Lemma 3 each
time we introduce into the composition automata with smaller indices than the
index of the replaced automaton. Eventually we obtain a sequence $(\mathcal{E}_i)_{i \le m''}$ of
deterministic stopping automata having the accessibility property, whose index
is either even or equal to 1 and whose composition is equivalent to the automaton
$\mathcal{A}$. Now let us take an arbitrary automaton $\mathcal{E}_i$ from the above composition
with an even index (if there is one). Assume $\mathcal{E}_i = (Q', \Sigma, \Delta', \Omega', A')$, where
$\Omega' = \{(L'_1, U'_1), \ldots, (L'_{n'}, U'_{n'})\}$. If for each $p \in \bigcup_{j=1}^{n'} L'_j$ the state $p$ is irrelevant

for the automaton £i [p] then by lemma 0 there exists a frontier deterministic 
stopping automaton with index 1 equivalent to £i (the construction from the 
lemma 0 does not depend on the initial state of the initial automaton, so there 
exists a function / establishing an equivalence between the above automata) 
Note that a single-element sequence consisting of a Büchi automaton $\mathcal{E}'$ forms
a proper composition, and we can use Lemma 4 to construct a substitution
$\mathcal{E}' \stackrel{f}{\mapsto} \mathcal{E}_i$ into the composition $\mathcal{E}_1 \oplus \ldots \oplus \mathcal{E}_{m''}$. We continue this procedure as
long as it is possible. Next we split the sequence $(\mathcal{E}_i)_{i \le m''}$ into two subsequences
$(\mathcal{A}_i)_{i \le l}$ and $(\mathcal{B}_j)_{j \le k}$, putting into the first one the automata with index 1 and into the
latter one the automata with an even index; thus we obtain the representation
of the automaton $\mathcal{A}$ in the form of a composition satisfying the conditions of
the theorem. Let us observe that we can compute the composition of all component
automata with index 1 and so satisfy the condition $l \le 1$. Moreover, let us
note that the fourth condition also holds. We know that for any $i \in \{1, \ldots, k\}$
there exists $p(i) \in \bigcup_{j=1}^{n_i} L^i_j$ such that the state $p(i)$ is relevant for the automaton

Bi\p{i)]. It follows from the lemma0that the automaton Bi[p{i)] admits a gadget 
G{p{i),p{i)). However since the automaton Bi[p{i)] has the accessibility property 

then it also admits gadgets of the form $G(p(i), q)$ for any $q \in \bigcup_{j=1}^{n_i} L^i_j$. Finally,
by the definition of the accessibility property and the fact that the index of the
automaton $\mathcal{B}_i$ is even, the fourth condition is satisfied. The remaining parts of
the claim are easy observations.

5 Main Results 

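Theorem 2. Let $\mathcal{B} = \mathcal{A}_1 \oplus \cdots \oplus \mathcal{A}_l \oplus \mathcal{B}_1 \oplus \ldots \oplus \mathcal{B}_k$ be a canonical form of the
deterministic stopping tree automaton $\mathcal{A}$ ($l, k \ge 0$). Then $L(\mathcal{A})$ is in the Büchi class
if and only if $k = 0$.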

For the proof see cm. 

Theorem 3. For an arbitrary deterministic stopping automaton $\mathcal{A}$ we are able
to decide if the language accepted by it is in the Büchi class.
Sketch of the proof. By the preceding theorem, to prove the decidability of the
considered problem we need to show that we can find a canonical form of an
arbitrary deterministic stopping automaton. Therefore it suffices to prove that
we are able to give a procedure deciding if a given state is irrelevant for some
deterministic stopping automaton $\mathcal{A}$. Let us observe that if we had a procedure
determining if a state is irrelevant for an automaton whose set of stopping states
is empty, then our task would be completed. This follows from the fact that we can
transform any stopping automaton into one of the above form. To do this we
add to the set of states a new state $x$, replace all stopping states by this state
and finally add transitions $(x, a, x, x)$ for every $a \in \Sigma$ and a condition $(\emptyset, \{x\})$
to the set of conditions. The automaton constructed in this way, with the empty set of
stopping states, has the following property: states which are irrelevant for it are
also irrelevant for the initial automaton and vice versa.
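
The transformation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation (tree-automaton transitions as tuples `(state, letter, left, right)`, conditions as pairs of frozensets) and the function name `remove_stopping_states` are hypothetical and not taken from the paper, and transitions leaving a stopping state are simply dropped, which the text does not spell out.

```python
def remove_stopping_states(states, alphabet, transitions, conditions, stopping, sink="x"):
    """Replace all stopping states by one fresh state `sink` (assumed encoding)."""
    assert sink not in states, "the new state must be fresh"

    def rename(q):
        # Every stopping state is identified with the new state.
        return sink if q in stopping else q

    new_states = {rename(q) for q in states} | {sink}
    # Assumption: transitions that start in a stopping state are discarded.
    new_transitions = {
        (rename(q), a, rename(left), rename(right))
        for (q, a, left, right) in transitions
        if q not in stopping
    }
    # Add the transitions (x, a, x, x) for every letter a.
    new_transitions |= {(sink, a, sink, sink) for a in alphabet}
    # Add the condition (empty set, {x}) to the set of conditions.
    new_conditions = [
        (frozenset(map(rename, low)), frozenset(map(rename, up)))
        for (low, up) in conditions
    ] + [(frozenset(), frozenset({sink}))]
    return new_states, new_transitions, new_conditions, set()
```

The last component of the result is the (now empty) set of stopping states.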

Additionally, we have to be able to construct an automaton graph $\mathcal{G}_{\mathcal{A}}$. The proof
of decidability of both of the problems is simple and uses the celebrated Rabin
theorem on decidability of S2S logic [12]. For a complete proof see Uni-
Abstract. Message Sequence Charts (MSCs) are an attractive visual
formalism widely used to capture system requirements during the early
design stages in domains such as telecommunication software. A standard
method to describe multiple communication scenarios is to use message
sequence graphs (MSGs). A message sequence graph allows the protocol
designer to write a finite specification which combines MSCs using basic
operations such as branching choice, composition and iteration. The MSC
languages described by MSGs are not necessarily regular in the sense of
[HM+99]. We characterize here the class of regular MSC languages that
are MSG-definable in terms of a notion called finitely generated MSC
languages. We show that a regular MSC language is MSG-definable if
and only if it is finitely generated. In fact we show that the subclass of
“bounded” MSGs defined in [AY99] exactly captures the class of finitely
generated regular MSC languages.



1 Introduction 

Message sequence charts (MSCs) are an appealing visual formalism often used 
to capture system requirements in the early design stages. They are particu- 
larly suited for describing scenarios for distributed telecommunication software 
IkCChbllTUhTI . They also appear in the literature as timing sequence diagrams, 
message flow diagrams and object interaction diagrams and are used in a number
of software engineering methodologies. In its basic
form, an MSC depicts a single partially-ordered execution of a distributed system
which just describes the exchange of messages between the processes of the
system. A collection of MSCs is used to capture the scenarios that a designer
might want the system to exhibit (or avoid).

Message Sequence Graphs (MSGs) are a nice mechanism for defining collec- 
tions of MSCs. An MSG is a finite directed graph with a designated initial vertex 
and terminal vertex in which each node is labelled by an MSC and the edges 
represent a natural concatenation operation on MSCs. The collection of MSCs 
defined by an MSG consists of all those MSCs obtained by tracing a path in the 
MSG from the initial vertex to the terminal vertex and concatenating the MSCs 
that are encountered along the path. It is easy to see that this way of defining a
collection of MSCs extends smoothly to the case where there are multiple terminal
nodes. Throughout what follows we shall assume this extended notion of an
MSG (that is, with multiple terminal nodes). For ease of presentation, we shall
also not deal with the so-called hierarchical MSGs [AY99].

Intuitively, not every MSG-definable collection of MSCs can be realized as
a finite-state device. To formalize this idea we have introduced a notion of a
regular collection of MSCs and studied its basic properties [HM+99]. Our notion
of regularity is independent of the notion of MSGs.

Our main goal in this paper is to pin down the regular MSC languages that
can be defined using MSGs. We introduce the notion of an MSC language being
finitely generated. From our results, which we detail below, it follows that a
regular MSC language is MSG-definable if and only if it is finitely generated. In
fact we establish the following results.

As already mentioned, not every MSG defines a regular MSC language. Alur
and Yannakakis have identified a syntactic property called boundedness and
shown that the set of all linearizations of the MSCs defined by a bounded MSG
is a regular string language over an appropriate alphabet of events. It then follows
easily that, in the present setting, every bounded MSG defines a finitely
generated regular MSC language. One of our main results here is that the converse
is also true, namely, every finitely generated regular MSC language can be
defined by a bounded MSG. Since every MSG (bounded or otherwise) defines
only a finitely generated MSC language, it follows that a regular MSC language
is finitely generated if and only if it is MSG-definable and, in fact, if and only if
it is bounded MSG-definable.

Since the class of regular MSC languages strictly includes the class of finitely
generated regular MSC languages, one could ask when a regular MSC language
is finitely generated. We show that this question is decidable. Finally, one can
also ask whether a given MSG defines a regular MSC language (and is hence
“equivalent” to a bounded MSG). We show that this decision problem is
undecidable.

Turning briefly to related literature, a number of studies are available which
are concerned with individual MSCs in terms of their semantics and properties
[lA H P96II ;l ;9.'i] . A variety of algorithms have been developed for MSCs in
the literature, for instance, pattern matching [l,f97IIVIus99IIVI PSDiSj , detection
of process divergence and non-local choice innsg, and confluence and race
conditions iMznnj. A systematic account of the various model-checking problems
associated with MSGs and their complexities can be found in [AY99]. Finally,
many of our proof techniques make use of results from the theory of Mazurkiewicz
traces mnnni.

In the next section we introduce MSCs and regular MSC languages. We then
introduce message sequence graphs in Section 3. In Section 4 we define finitely
generated MSC languages and provide an effective procedure to decide whether
a regular MSC language is finitely generated. Our main result that the class
of finitely generated regular MSC languages coincides with the class of bounded
MSG-definable languages is then established. Finally, we sketch why the problem
of determining if an MSG defines a regular MSC language is undecidable. Due
to lack of space, we give here only the main technical constructions and sketches
of proofs. All the details are available in [HM+99].

Fig. 1. An example MSC over {p, q, r}.

2 Regular MSC Languages 

We fix a finite set of processes (or agents) $\mathcal{P}$ and let $p, q, r$ range over $\mathcal{P}$. For
each $p \in \mathcal{P}$ we define $\Sigma_p = \{p!q \mid p \neq q\} \cup \{p?q \mid p \neq q\}$ to be the set of
communication actions in which $p$ participates. The action $p!q$ is to be read as
$p$ sends to $q$ and the action $p?q$ is to be read as $p$ receives from $q$. At our level
of abstraction, we shall not be concerned with the actual messages that are sent
and received. We will also not deal with the internal actions of the agents. We
set $\Sigma = \bigcup_{p \in \mathcal{P}} \Sigma_p$ and let $a, b$ range over $\Sigma$. We also denote the set of channels
by $Ch = \{(p, q) \mid p \neq q\}$ and let $c, d$ range over $Ch$.

A $\Sigma$-labelled poset is a structure $M = (E, \le, \lambda)$ where $(E, \le)$ is a poset and
$\lambda : E \to \Sigma$ is a labelling function. For $e \in E$ we define $\downarrow e = \{e' \mid e' \le e\}$. For $p \in \mathcal{P}$
and $a \in \Sigma$, we set $E_p = \{e \mid \lambda(e) \in \Sigma_p\}$ and $E_a = \{e \mid \lambda(e) = a\}$, respectively.
For each $c = (p, q) \in Ch$, we define the communication relation $R_c = \{(e, e') \mid \lambda(e) =
p!q, \lambda(e') = q?p$ and $|\downarrow e \cap E_{p!q}| = |\downarrow e' \cap E_{q?p}|\}$. Finally, for each $p \in \mathcal{P}$, we define
the $p$-causality relation $R_p = (E_p \times E_p) \cap \le$.

An MSC (over $\mathcal{P}$) is a finite $\Sigma$-labelled poset $M = (E, \le, \lambda)$ which satisfies
the following conditions:

(i) Each $R_p$ is a linear order.

(ii) If $p \neq q$ then $|E_{p!q}| = |E_{q?p}|$.

(iii) $\le\ = (R_{\mathcal{P}} \cup R_{Ch})^*$ where $R_{\mathcal{P}} = \bigcup_{p \in \mathcal{P}} R_p$ and $R_{Ch} = \bigcup_{c \in Ch} R_c$.

In diagrams, the events of an MSC are presented in visual order. The events of
each process are arranged in a vertical line and the members of the relation $R_{Ch}$
are displayed as horizontal or downward-sloping directed edges. We illustrate
the idea with an example, shown in Figure 1.

Here $\mathcal{P} = \{p, q, r\}$. For $x \in \mathcal{P}$, the events in $E_x$ are arranged along the
line labelled $(x)$ with earlier (relative to $\le$) events appearing above the later
events. The $R_{Ch}$-edges across agents are depicted by horizontal edges, for
instance $e_3\ R_{(r,q)}\ e'_2$. The labelling function $\lambda$ is easy to extract from the diagram;
for example, $\lambda(e'_3) = r!p$ and $\lambda(e_2) = q?p$.

We define regular MSC languages in terms of their linearizations. For the
MSC $M = (E, \le, \lambda)$, let $Lin(M) = \{\lambda(\pi) \mid \pi$ is a linearization of $(E, \le)\}$. By
abuse of notation, we have used $\lambda$ to also denote the natural extension of $\lambda$ to
$E^*$. The string $p!q\ r!q\ q?p\ q?r\ r!p\ p?r$ is a linearization of the MSC in Figure 1.
We say that $\sigma \in \Sigma^*$ is proper if for every prefix $\tau$ of $\sigma$ and every pair
$(p, q)$ of processes, $|\tau|_{p!q} \ge |\tau|_{q?p}$. We say that $\sigma$ is complete if $\sigma$ is proper and
$|\sigma|_{p!q} = |\sigma|_{q?p}$ for every pair of processes $(p, q)$. Clearly, any linearization of an
MSC is complete. Conversely, every complete sequence is the linearization of
some MSC.
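
The two conditions on words are easy to check mechanically. The following is a minimal sketch, assuming words are given as space-separated strings of actions written `p!q` and `p?q`; the helper names `parse`, `pending_messages`, `is_proper` and `is_complete` are illustrative and not part of the paper.

```python
from collections import Counter

def parse(word):
    """Split a word such as "p!q q?p" into (actor, kind, peer) triples."""
    actions = []
    for token in word.split():
        kind = "!" if "!" in token else "?"
        actor, peer = token.split(kind)
        actions.append((actor, kind, peer))
    return actions

def pending_messages(word):
    """Return the channel contents after reading word, or None if it is not proper."""
    pending = Counter()  # pending[(p, q)]: messages sent by p to q, not yet received
    for actor, kind, peer in parse(word):
        if kind == "!":
            pending[(actor, peer)] += 1
        else:
            # actor receives from peer, so the channel involved is (peer, actor)
            if pending[(peer, actor)] == 0:
                return None
            pending[(peer, actor)] -= 1
    return pending

def is_proper(word):
    return pending_messages(word) is not None

def is_complete(word):
    pending = pending_messages(word)
    return pending is not None and all(n == 0 for n in pending.values())
```

For instance, `is_complete("p!q r!q q?p q?r r!p p?r")` holds, while `"q?p p!q"` is not even proper because the receive precedes the matching send.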

Henceforth, we identify an MSC with its isomorphism class. We let $\mathcal{M}_{\mathcal{P}}$
be the set of MSCs over $\mathcal{P}$. An MSC language $\mathcal{L} \subseteq \mathcal{M}_{\mathcal{P}}$ is said to be regular if
$\bigcup\{Lin(M) \mid M \in \mathcal{L}\}$ is a regular subset of $\Sigma^*$. We note that the entire set $\mathcal{M}_{\mathcal{P}}$
is not regular by this definition.

We define $L \subseteq \Sigma^*$ to be a regular string MSC language if there exists a
regular MSC language $\mathcal{L} \subseteq \mathcal{M}_{\mathcal{P}}$ such that $L = \bigcup\{Lin(M) \mid M \in \mathcal{L}\}$. As shown
in [HM+99], regular MSC languages and regular string MSC languages represent
each other. Hence, abusing terminology, we will write “regular MSC language”
to mean “regular string MSC language”. From the context, it should be clear
whether we are working with MSCs from $\mathcal{M}_{\mathcal{P}}$ or complete words over $\Sigma^*$.

Given a regular subset $L \subseteq \Sigma^*$, we can decide whether $L$ is a regular MSC
language. We say that a state $s$ in a finite-state automaton is live if there is a path
from $s$ to a final state. Let $\mathcal{A} = (S, \Sigma, s_{in}, \delta, F)$ be the minimal DFA representing
$L$. Then it is not difficult to see that $L$ is a regular MSC language if and only if
we can associate with each live state $s \in S$ a (unique) channel-capacity function
$\mathcal{K}_s : Ch \to \mathbb{N}$ which satisfies the following conditions.

(i) If $s \in \{s_{in}\} \cup F$ then $\mathcal{K}_s(c) = 0$ for every $c \in Ch$.

(ii) If $s, s'$ are live states and $\delta(s, p!q) = s'$ then $\mathcal{K}_{s'}((p,q)) = \mathcal{K}_s((p,q)) + 1$ and
$\mathcal{K}_{s'}(c) = \mathcal{K}_s(c)$ for every $c \neq (p, q)$.

(iii) If $s, s'$ are live states and $\delta(s, q?p) = s'$ then $\mathcal{K}_s((p,q)) > 0$, $\mathcal{K}_{s'}((p,q)) =
\mathcal{K}_s((p,q)) - 1$ and $\mathcal{K}_{s'}(c) = \mathcal{K}_s(c)$ for every $c \neq (p, q)$.

(iv) Suppose $\delta(s, a) = s_1$ and $\delta(s_1, b) = s_2$ with $a \in \Sigma_p$ and $b \in \Sigma_q$, $p \neq q$. If
$(a, b) \notin Com$ or $\mathcal{K}_s((p,q)) > 0$, there exists $s'_1$ such that $\delta(s, b) = s'_1$ and
$\delta(s'_1, a) = s_2$. (Here and elsewhere $Com = \{(p!q, q?p) \mid p \neq q\}$.)

In the minimal DFA $\mathcal{A}$ representing a regular MSC language, if $s$ is a live state
and $a, b \in \Sigma$ then we say that $a$ and $b$ are independent at $s$ if $(a, b) \in Com$ implies
$\mathcal{K}_s((p,q)) > 0$, where $\mathcal{K}$ is the unique channel-capacity function associated with
$\mathcal{A}$ and $a = p!q$ and $b = q?p$.
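
As an illustration of how a channel-capacity function could be computed, here is a minimal sketch that propagates capacities from the initial state and checks conditions (i) to (iii) only; the diamond condition (iv) is not checked. The DFA encoding (a dictionary from `(state, (actor, kind, peer))` to a state) and the name `channel_capacities` are assumptions made for this example.

```python
from collections import deque

def channel_capacities(initial, finals, delta):
    """Return {live state: {channel: count}}, or None if (i)-(iii) cannot hold."""
    # Live states: those from which some final state is reachable.
    preds = {}
    for (src, _), dst in delta.items():
        preds.setdefault(dst, set()).add(src)
    live, stack = set(finals), list(finals)
    while stack:
        for src in preds.get(stack.pop(), ()):
            if src not in live:
                live.add(src)
                stack.append(src)
    if initial not in live:
        return {}  # empty language: there is nothing to label
    caps = {initial: {}}  # a missing channel stands for capacity 0
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        for (src, (actor, kind, peer)), t in delta.items():
            if src != s or t not in live:
                continue
            new = dict(caps[s])
            if kind == "!":
                channel = (actor, peer)
                new[channel] = new.get(channel, 0) + 1
            else:
                channel = (peer, actor)
                if new.get(channel, 0) == 0:
                    return None  # a receive on an empty channel violates (iii)
                new[channel] -= 1
            new = {c: n for c, n in new.items() if n > 0}
            if t in caps:
                if caps[t] != new:
                    return None  # the capacity of t would not be unique
            else:
                caps[t] = new
                queue.append(t)
    if any(caps[f] for f in finals if f in caps):
        return None  # condition (i): final states must have empty channels
    return caps
```

On the two-state automaton for $(p!q\ q?p)^*$ this yields capacity 0 everywhere at the initial state and capacity 1 on channel $(p, q)$ at the intermediate state.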

Fig. 2. An example MSG.

We conclude this section by introducing the notion of $B$-bounded MSC languages.
Let $B \in \mathbb{N}$ be a natural number. We say that a word $\sigma$ in $\Sigma^*$ is
$B$-bounded if for each prefix $\tau$ of $\sigma$ and for each channel $(p, q) \in Ch$, $|\tau|_{p!q} - |\tau|_{q?p} \le
B$. We say that $L \subseteq \Sigma^*$ is $B$-bounded if every word $\sigma \in L$ is $B$-bounded. It is
not difficult to show:

Proposition 2.1. Let $L$ be a regular MSC language. There is a bound $B \in \mathbb{N}$
such that $L$ is $B$-bounded.
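
For a single word, the smallest such bound is just the largest number of messages ever pending in one channel. A minimal sketch, assuming the word is proper and written as space-separated actions `p!q` / `q?p` (the function name `channel_bound` is illustrative):

```python
def channel_bound(word):
    """Smallest B such that the given proper word is B-bounded."""
    pending = {}
    bound = 0
    for token in word.split():
        if "!" in token:
            p, q = token.split("!")      # p sends to q
            channel = (p, q)
            pending[channel] = pending.get(channel, 0) + 1
        else:
            q, p = token.split("?")      # q receives from p
            channel = (p, q)
            pending[channel] = pending.get(channel, 0) - 1
        bound = max(bound, pending[channel])
    return bound
```

For example, `channel_bound("p!q p!q q?p q?p")` is 2, whereas interleaving the same events as `"p!q q?p p!q q?p"` gives 1.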

3 Message Sequence Graphs 

An MSG allows a protocol designer to write in a standard way jTTTTnT] . a finite 
specification which combines MSCs using operations such as branching choice, 
composition and iteration. Each node is labelled by an MSC and the edges 
represent the natural operation of MSC concatenation. 

To bring out this concatenation operation, we let $M_1 = (E_1, \le_1, \lambda_1)$ and
$M_2 = (E_2, \le_2, \lambda_2)$ be a pair of MSCs such that $E_1$ and $E_2$ are disjoint. For
$i \in \{1, 2\}$, let $R^i_c$ and $R^i_p$ denote the underlying communication and process
causality relations in $M_i$. The (asynchronous) concatenation of $M_1$ and $M_2$,
denoted $M_1 \circ M_2$, is the MSC $(E, \le, \lambda)$ where $E = E_1 \cup E_2$, $\lambda(e) = \lambda_i(e)$ if
$e \in E_i$, $i \in \{1, 2\}$, and $\le\ = (R_{\mathcal{P}} \cup R_{Ch})^*$, where $R_p = R^1_p \cup R^2_p \cup \{(e_1, e_2) \mid e_1 \in
E_1, e_2 \in E_2, \lambda(e_1) \in \Sigma_p, \lambda(e_2) \in \Sigma_p\}$ for $p \in \mathcal{P}$, and $R_c = R^1_c \cup R^2_c$ for $c \in Ch$.

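A Message Sequence Graph (MSG) is a structure $\mathcal{G} = (Q, \to, Q_{in}, F, \Phi)$,
where:

— $Q$ is a finite and nonempty set of states.

— $\to\ \subseteq Q \times Q$.

— $Q_{in} \subseteq Q$ is a set of initial states.

— $F \subseteq Q$ is a set of final states.

— $\Phi : Q \to \mathcal{M}_{\mathcal{P}}$ is a (state-)labelling function.

A path $\pi$ through an MSG $\mathcal{G}$ is a sequence $q_0 \to q_1 \to \cdots \to q_n$ such that
$(q_{i-1}, q_i) \in\ \to$ for $i \in \{1, 2, \ldots, n\}$. The MSC generated by $\pi$ is $M(\pi) =
M_0 \circ M_1 \circ M_2 \circ \cdots \circ M_n$, where $M_i = \Phi(q_i)$. A path $\pi = q_0 \to q_1 \to \cdots \to q_n$
is a run if $q_0 \in Q_{in}$ and $q_n \in F$. The language of MSCs accepted by $\mathcal{G}$ is
$\mathcal{L}(\mathcal{G}) = \{M(\pi) \in \mathcal{M}_{\mathcal{P}} \mid \pi$ is a run through $\mathcal{G}\}$.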
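
To make the definition of runs concrete, the sketch below enumerates the runs of a small MSG up to a length bound and returns one linearization of the MSC generated by each run. It relies on the observation that concatenating linearizations of $M_1$ and $M_2$ gives a linearization of $M_1 \circ M_2$. The encoding (node labels stored as linearization strings) and the two-node example graph are hypothetical; the graph is not the one shown in the paper's figure.

```python
def msg_linearizations(edges, initial, final, label, max_nodes):
    """One linearization per run that visits at most max_nodes nodes."""
    results = set()
    stack = [(q, (q,)) for q in initial]
    while stack:
        q, path = stack.pop()
        if q in final:
            # Concatenate the node labels along the run.
            results.add(" ".join(label[n] for n in path if label[n]))
        if len(path) < max_nodes:
            for src, dst in edges:
                if src == q:
                    stack.append((dst, path + (dst,)))
    return results

# Hypothetical MSG: node "a" loops on itself and then moves to the final node "b".
example = msg_linearizations(
    edges={("a", "a"), ("a", "b")},
    initial={"a"},
    final={"b"},
    label={"a": "p!q q?p", "b": "r!s s?r"},
    max_nodes=3,
)
```

Distinct runs may generate the same MSC, so the returned set is a set of representative words rather than a set of MSCs.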

An example of an MSG is depicted in Figure 2. It is not hard to see that the
MSC language $\mathcal{L}$ defined is not regular in the sense defined in Section 2. To see
this, we note that $\mathcal{L}$ projected to $\{p!q, r!s\}$ is not a regular string language.
Fig. 3. The atomic MSC $M_2$ of the non-finitely-generated language $\mathcal{L}_{inf}$.

4 Bounded MSGs and Finitely Generated MSC 
Languages 

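Let $X_1, X_2 \subseteq \mathcal{M}_{\mathcal{P}}$ be two sets of MSCs. As usual, $X_1 \circ X_2$ denotes the pointwise
concatenation of $X_1$ and $X_2$ given by $\{M \mid \exists M_1 \in X_1, M_2 \in X_2 : M = M_1 \circ M_2\}$.
For $X \subseteq \mathcal{M}_{\mathcal{P}}$, we define $X^0 = \{\varepsilon\}$, where $\varepsilon$ denotes the empty MSC, and for
$i \ge 0$, $X^{i+1} = X \circ X^i$. The asynchronous iteration of $X$ is then defined by
$X^{\circledast} = \bigcup_{i \ge 0} X^i$.

Let $\mathcal{L} \subseteq \mathcal{M}_{\mathcal{P}}$. We say that $\mathcal{L}$ is finitely generated if there is a finite set of
MSCs $X \subseteq \mathcal{M}_{\mathcal{P}}$ such that $\mathcal{L} \subseteq X^{\circledast}$.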

We first observe that not every regular MSC language is finitely generated.
Let $\mathcal{P} = \{p, q, r\}$. Consider the regular expression $p!q\ \sigma^*\ q?p$, where $\sigma$ is the
sequence $p!r\ r?p\ r!q\ q?r\ q!p\ p?q$. This expression describes an infinite set of
MSCs $\mathcal{L}_{inf} = \{M_i\}_{i=0}^{\infty}$. Figure 3 shows the MSC $M_2$. None of the MSCs in this
language can be expressed as the concatenation of two or more non-trivial MSCs.
Hence, this language is not finitely generated.

Our interest in finitely generated languages stems from the fact that MSGs 
can define only finitely generated MSC languages. We now wish to decide whether 
a regular MSC language is finitely generated. To do this we shall make use of 
the notion of atoms. 

Let $M, M' \in \mathcal{M}_{\mathcal{P}}$ be nonempty MSCs. Then $M'$ is a component of $M$ in case
there exist $M_1, M_2 \in \mathcal{M}_{\mathcal{P}}$ such that $M = M_1 \circ M' \circ M_2$. We say that $M$ is an atom
if the only component of $M$ is $M$ itself. Thus, an atom is a nonempty message
sequence chart that cannot be decomposed into non-trivial subcomponents. For an
MSC $M$ we let $Comp(M)$ denote the set $\{M' \mid M'$ is a component of $M\}$. We let
$Atoms(M)$ denote the set $\{M' \mid M'$ is an atom and $M'$ is a component of $M\}$.
For an MSC language $\mathcal{L} \subseteq \mathcal{M}_{\mathcal{P}}$, $Comp(\mathcal{L}) = \bigcup\{Comp(M) \mid M \in \mathcal{L}\}$ and
$Atoms(\mathcal{L}) = \bigcup\{Atoms(M) \mid M \in \mathcal{L}\}$. It is clear that the question of deciding
whether $\mathcal{L}$ is finitely generated is equivalent to checking whether $Atoms(\mathcal{L})$ is
finite.

Theorem 4.1. Let $\mathcal{L}$ be a regular MSC language. It is decidable whether $\mathcal{L}$ is
finitely generated.

Proof Sketch: Let $\mathcal{A} = (S, \Sigma, s_{in}, \delta, F)$ be the minimum DFA for $\mathcal{L}$. From $\mathcal{A}$,
we construct a finite family of finite-state automata which together accept the
linearizations of the MSCs in $Atoms(\mathcal{L})$. It will then follow that $\mathcal{L}$ is finitely
generated if and only if each of these automata accepts a finite language. We
sketch the details below.

We know that for each live state $s \in S$, we can assign a capacity function
$\mathcal{K}_s : Ch \to \mathbb{N}$ which counts the number of messages present in each channel
when the state $s$ is reached. We say that $s$ is a zero-capacity state if $\mathcal{K}_s(c) = 0$
for each channel $c$. The following claims are easy to prove.

Claim 1: Let $M$ be an MSC in $Comp(\mathcal{L})$ (in particular, in $Atoms(\mathcal{L})$) and $w$
be a linearization of $M$. Then, there are zero-capacity live states $s, s'$ in $\mathcal{A}$ such
that $s \xrightarrow{w} s'$.

Claim 2: Let $M$ be an MSC in $Comp(\mathcal{L})$. $M$ is an atom if and only if for each
linearization $w$ of $M$ and each pair $(s, s')$ of zero-capacity live states in $\mathcal{A}$, if
$s \xrightarrow{w} s'$ then no intermediate state visited in this run has zero-capacity.
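
The role of zero-capacity points can be seen on individual words: cutting a complete word wherever all channels are empty splits it into complete factors. The sketch below does this for one given linearization. It is only an illustration; the function name `complete_factors` is an assumption, and a factor obtained this way is an atom only if no equivalent linearization can be cut further, which is exactly why Claim 2 quantifies over all linearizations.

```python
def complete_factors(word):
    """Cut a complete word at every point where all channels are empty."""
    pending = {}
    factors, current = [], []
    for token in word.split():
        if "!" in token:
            p, q = token.split("!")
            pending[(p, q)] = pending.get((p, q), 0) + 1
        else:
            q, p = token.split("?")
            if pending.get((p, q), 0) == 0:
                raise ValueError("word is not proper")
            pending[(p, q)] -= 1
        current.append(token)
        if not any(pending.values()):
            # All channels are empty: a complete factor ends here.
            factors.append(" ".join(current))
            current = []
    if current:
        raise ValueError("word is not complete")
    return factors
```

The words `"p!q q?p r!s s?r"` and `"p!q r!s q?p s?r"` linearize the same MSC, yet only the first one is cut into two factors.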

Let us say that two complete words $w$ and $w'$ are equivalent, written $w \sim w'$,
if they are linearizations of the same MSC. Suppose $s \xrightarrow{w} s'$ and $w \sim w'$.
Then it is easy to see that $s \xrightarrow{w'} s'$ as well.

With each pair $(s, s')$ of live zero-capacity states we associate a language
$L_{At}(s, s')$. A word $w$ belongs to $L_{At}(s, s')$ if and only if $w$ is complete, $s \xrightarrow{w} s'$
and for each $w' \sim w$ the run $s \xrightarrow{w'} s'$ has no zero-capacity intermediate states.
From the two claims above, each of these languages consists of all the
linearizations of some subset of $Atoms(\mathcal{L})$, and the linearizations of each element
of $Atoms(\mathcal{L})$ are contained in some $L_{At}(s, s')$. Thus, it suffices to check for the
finiteness of each of these languages.

Let $L_{s,s'}$ be the language of strings accepted by $\mathcal{A}$ when we set the initial
state to be $s$ and the set of final states to be $\{s'\}$. Clearly $L_{At}(s, s') \subseteq L_{s,s'}$. We
now show how to construct an automaton for $L_{At}(s, s')$.

We begin with $\mathcal{A}$ and prune the automaton as follows:

— Remove all incoming edges at $s$ and all outgoing edges at $s'$.

— If $t \notin \{s, s'\}$ and $\mathcal{K}_t = 0$, remove $t$ and all its incoming and outgoing edges.

— Recursively remove all states that become unreachable as a result of the
preceding operation.


Let $\mathcal{B}$ be the resulting automaton. $\mathcal{B}$ accepts any complete word $w$ on which
the run from $s$ to $s'$ does not visit an intermediate zero-capacity state. Clearly,
$L_{At}(s, s') \subseteq L(\mathcal{B})$. However, $L(\mathcal{B})$ may also contain linearizations of non-atomic
MSCs that happen to have no complete prefix. For all such words, we know
from Claim 2 that there is at least one equivalent linearization on which the run
passes through a zero-capacity state and which would hence be eliminated from
$L(\mathcal{B})$. Thus, $L_{At}(s, s')$ is the $\sim$-closed subset of $L(\mathcal{B})$ and we need to prune $\mathcal{B}$
further to obtain the automaton for $L_{At}(s, s')$.

Recall that the original DFA $\mathcal{A}$ was structurally closed with respect to the
independence relation on communication actions in the following sense. Suppose
$\delta(s_1, a) = s_2$ and $\delta(s_2, b) = s_3$ with $a, b$ independent at $s_1$. Then, there exists $s'_2$
such that $\delta(s_1, b) = s'_2$ and $\delta(s'_2, a) = s_3$.

To identify the closed subset of $L(\mathcal{B})$, we look for local violations of this
“diamond” property and carefully prune transitions. We first blow up the state
space into triples of the form $(s_1, s_2, s_3)$ such that there exist $a$ and $a'$ with
$\delta(s_1, a) = s_2$ and $\delta(s_2, a') = s_3$. Let $S'$ denote this set of triples. We obtain
a nondeterministic transition relation $\delta' = \{((s_1, s_2, s_3), a, (t_1, t_2, t_3)) \mid s_2 =
t_1, s_3 = t_2, \delta(s_2, a) = s_3\}$. Set $S'_{in} = \{(s_1, s_2, s_3) \in S' \mid s_2 = s_{in}\}$ and $F' =
\{(s_1, s_2, s_3) \in S' \mid s_2 \in F\}$. Let $\mathcal{B}' = (S', \Sigma, \delta', S'_{in}, F')$.

Consider any state $s_1$ in $\mathcal{B}$ such that $a$ and $b$ are independent at $s_1$, $\delta(s_1, a) =
s_2$, $\delta(s_2, b) = s_3$ but there is no $s'_2$ such that $\delta(s_1, b) = s'_2$ and $\delta(s'_2, a) = s_3$. For
each such $s_1$, we remove all transitions of the form $((t, s_1, s_2), a, (s_1, s_2, t'))$ and
$((t, s_2, s_3), b, (s_2, s_3, t'))$ from $\mathcal{B}'$. We then recursively remove all states which
become unreachable after this pruning.

Eventually, we arrive at an automaton $\mathcal{C}$ such that $L(\mathcal{C}) = L_{At}(s, s')$. Since
$\mathcal{C}$ is a finite-state automaton, we can easily check whether $L(\mathcal{C})$ is finite. This
process is repeated for each pair of live zero-capacity states. □
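
The finiteness check in the last step is the standard one: the language of a finite automaton is finite exactly when no cycle passes through a state that is both reachable from an initial state and able to reach a final state. A minimal sketch, assuming transitions are given as a set of `(source, letter, target)` triples (the function name `language_is_finite` is illustrative):

```python
def language_is_finite(transitions, initial, finals):
    """True iff the (possibly nondeterministic) automaton accepts finitely many words."""
    succ, pred = {}, {}
    for src, _, dst in transitions:
        succ.setdefault(src, set()).add(dst)
        pred.setdefault(dst, set()).add(src)

    def closure(starts, graph):
        seen, stack = set(starts), list(starts)
        while stack:
            for nxt in graph.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    # Useful states lie on some path from an initial to a final state.
    useful = closure(initial, succ) & closure(finals, pred)
    # Topological sorting succeeds on the useful part iff it has no cycle.
    indegree = {u: 0 for u in useful}
    for u in useful:
        for v in succ.get(u, ()):
            if v in useful:
                indegree[v] += 1
    ready = [u for u in useful if indegree[u] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in succ.get(u, ()):
            if v in useful:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == len(useful)
```

Cycles among states that cannot reach a final state are ignored, since they do not contribute accepted words.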

Alur and Yannakakis introduced the notion of boundedness for MSGs [AY99].
They also showed that the set of all linearizations of the MSCs defined by a
bounded MSG is a regular string language. In the present setting this boils down
to boundedness of an MSG being a sufficient condition for its MSC language
to be regular. To state their condition, we first have to define the notion of
connectedness.

Let $M = (E, \le, \lambda)$ be an MSC. We let $CG_M$, the communication graph of
$M$, be the directed graph $(\mathcal{P}, \mapsto)$ defined as follows: $(p, q) \in\ \mapsto$ if and only if
there exists an $e \in E$ with $\lambda(e) = p!q$. $M$ is connected if $CG_M$ consists of one
non-trivial strongly connected component and isolated vertices. For an MSC
language $\mathcal{L} \subseteq \mathcal{M}_{\mathcal{P}}$ we say that $\mathcal{L}$ is connected in case every $M \in \mathcal{L}$ is connected.
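
Connectedness depends only on which send actions occur, so it can be checked directly on any linearization. The sketch below builds the communication graph from a word and tests whether the processes that take part in some communication form a single strongly connected component; the input format (space-separated actions `p!q` / `q?p`) and the name `is_connected` are assumptions made for this example.

```python
def is_connected(word):
    """Check connectedness of the MSC linearized by word via its communication graph."""
    succ, pred = {}, {}
    for token in word.split():
        if "!" in token:
            p, q = token.split("!")
            succ.setdefault(p, set()).add(q)   # edge p -> q for the action p!q
            pred.setdefault(q, set()).add(p)
    active = set(succ) | set(pred)             # all non-isolated vertices
    if not active:
        return False                           # no non-trivial component at all

    def reach(start, graph):
        seen, stack = {start}, [start]
        while stack:
            for nxt in graph.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    start = next(iter(active))
    # One strongly connected component: everything is reachable forwards and backwards.
    return reach(start, succ) == active and reach(start, pred) == active
```

A single message from `p` to `q` is not connected in this sense, whereas a message followed by a reply is.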

Let $\mathcal{G} = (Q, \to, Q_{in}, F, \Phi)$ be an MSG. A loop in $\mathcal{G}$ is a sequence of edges
that starts and ends at the same node. We say that $\mathcal{G}$ is bounded if for every
loop $\pi = q \to \cdots \to q$, the MSC $M(\pi)$ is connected. An MSC language $\mathcal{L}$ is a
bounded MSG-language if there exists a bounded MSG $\mathcal{G}$ with $\mathcal{L} = \mathcal{L}(\mathcal{G})$.

It is easy to check whether a given MSG is bounded. Clearly, the MSG of
Figure 2 is not bounded. One of the main results concerning bounded MSGs
shown in [AY99] at once implies:
Lemma 4.2. Every bounded MSG-language is a regular MSC language.

Our main interest in this section is to prove the converse of Lemma 4.2.

Lemma 4.3. Let $\mathcal{L}$ be a finitely generated regular MSC language. Then, $\mathcal{L}$ is a
bounded MSG-language.

Proof Sketch: Suppose $\mathcal{L}$ is a regular MSC language accepted by the minimal
DFA $\mathcal{A} = (S, \Sigma, s_{in}, \delta, F)$. Let $Atoms(\mathcal{L}) = \{a_1, a_2, \ldots, a_m\}$. For each atom $a_i$,
fix a linearization $u_i \in Lin(a_i)$. Define an auxiliary DFA $\mathcal{B} = (S^0, Atoms(\mathcal{L}), s_{in},
\hat{\delta}, \hat{F})$ as follows:

— $S^0$ is the set of states of $\mathcal{A}$ which have zero-capacity functions.

— $\hat{F} = F$

— $\hat{\delta}(s, a_i) = s'$ iff $\delta(s, u_i) = s'$ in $\mathcal{A}$. (Note that $u, u' \in Lin(a_i)$ implies $\delta(s, u) =
\delta(s, u')$, so $s'$ is fixed independent of the choice of $u_i \in Lin(a_i)$.)

Thus, $\mathcal{B}$ accepts the (regular) language of atoms corresponding to $L(\mathcal{A})$. We
can define a natural independence relation $I_{\mathcal{A}}$ on atoms as follows: atoms $a_i$ and
$a_j$ are independent if and only if the set of active processes in $a_i$ is disjoint from
the set of active processes in $a_j$. (The process $p$ is active in the MSC $(E, \le, \lambda)$
if $E_p$ is non-empty.)

It follows that $L(\mathcal{B})$ is a regular Mazurkiewicz trace language over the trace
alphabet $(Atoms(\mathcal{L}), I_{\mathcal{A}})$. As usual, for $w \in Atoms(\mathcal{L})^*$, we let $[w]$ denote the
equivalence class of $w$ with respect to $I_{\mathcal{A}}$.

We now fix a strict linear order $\prec$ on $Atoms(\mathcal{L})$. This induces a (lexicographic)
total order on words over $Atoms(\mathcal{L})$. Let $Lex \subseteq Atoms(\mathcal{L})^*$ be given
by: $w \in Lex$ iff $w$ is the lexicographically least element in $[w]$.

For a trace language $L$ over $(Atoms(\mathcal{L}), I_{\mathcal{A}})$, let $lex(L)$ denote the set $L \cap Lex$.

Remark 4.4 ([DR95], Sect. 6.3.1)

(i) If $L$ is a regular trace language over $(Atoms(\mathcal{L}), I_{\mathcal{A}})$, then $lex(L)$ is a regular language over $Atoms(\mathcal{L})$. Moreover, $L = \{[w] \mid w \in lex(L)\}$.

(ii) If $w_1 w w_2 \in Lex$, then $w \in Lex$.

(iii) If $w$ is not a connected trace, then $ww \notin Lex$.

From (i) we know that $lex(L(\mathcal{B}))$ is a regular language over $Atoms(\mathcal{L})$. Let $\mathcal{C} = (S', Atoms(\mathcal{L}), s'_{in}, \delta', F')$ be the DFA over $Atoms(\mathcal{L})$ obtained by eliminating the (unique) dead state, if any, from the minimal DFA for $lex(L(\mathcal{B}))$. It is easy to see that an MSC $M$ belongs to $\mathcal{L}$ if and only if it can be decomposed into a sequence of atoms accepted by $\mathcal{C}$. Using this fact, we can derive an MSG $\mathcal{G}$ from $\mathcal{C}$ such that $L(\mathcal{G}) = \mathcal{L}$. We define $\mathcal{G} = (Q, \to, Q_{in}, F, \Phi)$ as follows:

— $Q = S' \times (Atoms(\mathcal{L}) \cup \{\varepsilon\})$.

^ A trace is said to be connected if, when viewed as a labelled partial order, its Hasse 
diagram consists of a single connected component. See for a more formal 

definition.

— $Q_{in} = \{(s'_{in}, \varepsilon)\}$.

— $(s, b) \to (s', b')$ iff $\delta'(s, b') = s'$.

— $F = F' \times Atoms(\mathcal{L})$.

— $\Phi(s, b) = b$.

Clearly $\mathcal{G}$ is an MSG and the MSC language that it defines is $\mathcal{L}$. We need to show that $\mathcal{G}$ is bounded. To this end, let $\pi = (s, b) \to (s_1, b_1) \to \cdots \to (s_n, b_n) \to (s, b)$ be a loop in $\mathcal{G}$. We need to establish that the MSC $M(\pi) = b_1 \circ \cdots \circ b_n \circ b$ defined by this loop is connected. Let $w = b_1 b_2 \cdots b_n b$.

Consider the corresponding loop $s \xrightarrow{b_1} s_1 \xrightarrow{b_2} \cdots \xrightarrow{b_n} s_n \xrightarrow{b} s$ in $\mathcal{C}$. Since every state in $\mathcal{C}$ is live, there must be words $w_1, w_2$ over $Atoms(\mathcal{L})$ such that $w_1 w^k w_2 \in lex(L(\mathcal{B}))$ for every $k \geq 0$.

From (ii) of Remark 4.4, $ww \in Lex$. This means, by (iii) of Remark 4.4, that $w$ describes a connected trace over $(Atoms(\mathcal{L}), I_{\mathcal{A}})$. From this, it is not difficult to see that the underlying undirected graph of the communication graph $CG_{M(\pi)}$ consists of a single connected component $C \subseteq \mathcal{P}$ and isolated processes. We have to argue that the component $C$ is, in fact, strongly connected. We show that if $C$ is not strongly connected, then the regular MSC language $\mathcal{L}$ is not $B$-bounded for any $B \in \mathbb{N}$, thus contradicting the earlier proposition on the boundedness of regular MSC languages.

Suppose that the underlying graph of $C$ is connected but $C$ is not strongly connected. Then, there exist two processes $p, q \in C$ such that $p \mapsto q$, but there is no path from $q$ back to $p$ in $CG_{M(\pi)}$. For $k \geq 0$, let $M(\pi)^k = (E, \leq, \lambda)$ be the MSC corresponding to the $k$-fold iteration $\underbrace{M(\pi) \circ M(\pi) \circ \cdots \circ M(\pi)}_{k \text{ times}}$. Since $p \mapsto q$ in $CG_{M(\pi)}$, it follows that there are events labelled $p!q$ and $q?p$ in $M(\pi)^k$. Moreover, since there is no path from $q$ back to $p$ in $CG_{M(\pi)}$, we can conclude that in $M(\pi)^k$, for each event $e$ with $\lambda(e) = p!q$, there is no event labelled $q?p$ in $\downarrow e$. This means that $M(\pi)^k$ admits a linearization $v'_k$ with a prefix $\tau'_k$ which includes all the events labelled $p!q$ and excludes all the events labelled $q?p$, so that $|\tau'_k|_{p!q} - |\tau'_k|_{q?p} \geq k$.

Since $\mathcal{L}$ is a regular MSC language, by the earlier proposition on boundedness there is a bound $B \in \mathbb{N}$ such that every word in $\mathcal{L}$ is $B$-bounded; that is, for each $v \in \mathcal{L}$, for each prefix $\tau$ of $v$ and for each channel $(p, q) \in Ch$, $|\tau|_{p!q} - |\tau|_{q?p} \leq B$. Recall that $w_1 w^k w_2 \in lex(L(\mathcal{B}))$ for every $k \geq 0$. Fix linearizations $v_1$ and $v_2$ of the atom sequences $w_1$ and $w_2$, respectively. Then, for every $k \geq 0$, $u_k = v_1 v'_k v_2 \in \mathcal{L}$, where $v'_k$ is the linearization of $M(\pi)^k$ defined earlier. Setting $k$ to be $B+1$, we find that $u_k$ admits a prefix $\tau_k = v_1 \tau'_k$ such that $|\tau_k|_{p!q} - |\tau_k|_{q?p} \geq B+1$, which contradicts the $B$-boundedness of $\mathcal{L}$.

Hence, it must be the case that $C$ is a strongly connected component, which establishes that the MSG $\mathcal{G}$ we have constructed is bounded. □

Putting together Lemmas 4.2 and 4.3, we obtain the following characterization of MSG-definable regular MSC languages.

Theorem 4.5. Let $\mathcal{L}$ be a regular MSC language. Then the following statements are equivalent:

(i) $\mathcal{L}$ is finitely generated.

(ii) $\mathcal{L}$ is a bounded MSG-language.

(iii) $\mathcal{L}$ is MSG-definable.

Fig. 4. A non-bounded MSG whose language is regular.

It is easy to see that boundedness is not a necessary condition for regularity. Consider the MSG in Figure 4, which is not bounded. It accepts the regular MSC language $M \circ (M + M')^{\circledast}$.

Thus, it would be useful to provide a characterization of the class of MSGs 
representing regular MSC languages. Unfortunately, the following result shows 
that there is no (recursive) characterization of this class. 

Theorem 4.6. The problem of deciding whether a given MSG represents a regular MSC language is undecidable.



Proof Sketch: It is known that the problem of determining whether the trace-closure of a regular language $L \subseteq A^*$ with respect to a trace alphabet $(A, I)$ is also regular is undecidable. We reduce this problem to the problem of checking whether the MSC language defined by an MSG is regular.

Let $(A, I)$ be a trace alphabet. We fix a set of processes $\mathcal{P}$ and the associated communication alphabet $\Sigma$ and encode each letter $a$ by an MSC $M_a$ over $\mathcal{P}$ such that the communication graph $CG_{M_a}$ is strongly connected. Moreover, if $(a, b) \in I$, then the sets of active processes of $M_a$ and $M_b$ are disjoint. The encoding ensures that we can construct a finite-state automaton to parse any word over $\Sigma$ and determine whether it arises as the linearization of an MSC of the form $M_{a_1} \circ M_{a_2} \circ \cdots \circ M_{a_k}$. If so, the parser can uniquely reconstruct the corresponding word $a_1 a_2 \cdots a_k$ over $A$.

Let $\mathcal{A}$ be the minimal DFA corresponding to a regular language $L$ over $A$. We construct an MSG $\mathcal{G}$ from $\mathcal{A}$ as described in the proof of Lemma 4.3. Given the properties of our encoding, we can then establish that the MSC language $L(\mathcal{G})$ is regular if and only if the trace-closure of $L$ is regular, thus completing
the reduction. □

Abstract. We postulate that a distribution is pseudorandom if it cannot be told
apart from the uniform distribution by any efficient procedure. This yields a ro- 
bust definition of pseudorandom generators as efficient deterministic programs 
stretching short random seeds into longer pseudorandom sequences. Thus, pseu- 
dorandom generators can be used to reduce the randomness-complexity in any 
efficient procedure. Pseudorandom generators and computational difficulty are 
closely related: loosely speaking, each can be efficiently transformed into the 
other. 



1 Introduction 

The second half of this century has witnessed the development of three theories of randomness, a notion which has been puzzling thinkers for ages. The first theory (cf. [9]), initiated by Shannon [33], is rooted in probability theory and focuses on distributions that are not perfectly random. Shannon's Information Theory characterizes perfect randomness as the extreme case in which the information content is maximized (and there is no redundancy at all).¹ Thus, perfect randomness is associated with a unique distribution - the uniform one. In particular, by definition, one cannot generate such perfect random strings from shorter random strings.

The second theory (cf. [22,23]), due to Solomonov [34], Kolmogorov [21] and Chaitin [6], is rooted in computability theory and specifically in the notion of a universal language (equiv., universal machine or computing device). It measures the complexity of objects in terms of the shortest program (for a fixed universal machine) that generates the object.² Like Shannon's theory, Kolmogorov Complexity is quantitative and perfect random objects appear as an extreme case. Interestingly, in this approach one may say that a single object, rather than a distribution over objects, is perfectly random. Still, Kolmogorov's approach is inherently intractable (i.e., Kolmogorov Complexity is uncomputable), and - by definition - one cannot generate strings of high Kolmogorov Complexity from short random strings.

¹ In general, the amount of information in a distribution $D$ is defined as $-\sum_x D(x) \log_2 D(x)$. Thus, the uniform distribution over strings of length $n$ has information measure $n$, and any other distribution over $n$-bit strings has lower information measure. Also, for any function $f : \{0,1\}^n \to \{0,1\}^m$ with $n < m$, the distribution obtained by applying $f$ to a uniformly distributed $n$-bit string has information measure at most $n$, which is strictly lower than the length of the output.

² For example, the string $1^n$ has Kolmogorov Complexity $O(1) + \log_2 n$ (by virtue of the program "print n ones", which has length dominated by the binary encoding of $n$). In contrast, a simple counting argument shows that most $n$-bit strings have Kolmogorov Complexity at least $n$ (since each program can produce only one string).

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 687-704, 2000.
© Springer-Verlag Berlin Heidelberg 2000
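The information measure from footnote 1 is easy to check numerically on tiny examples. The following sketch is illustrative only: distributions are represented as dictionaries from bit strings to probabilities, and the stretching map `double` is a hypothetical example of a function from 3 bits to 6 bits, not something taken from the survey.

```python
import math
from collections import Counter
from itertools import product


def entropy(dist):
    """Information measure -sum_x D(x) log2 D(x) of a finite distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def uniform(n):
    """Uniform distribution over all n-bit strings."""
    return {"".join(bits): 2.0 ** -n for bits in product("01", repeat=n)}


def push_forward(dist, f):
    """Distribution of f(x) when x is drawn from `dist`."""
    out = Counter()
    for x, p in dist.items():
        out[f(x)] += p
    return dict(out)


def double(x):
    """Hypothetical stretching map: repeats its input, so 3 bits become 6 bits."""
    return x + x
```

Applying `double` to the uniform distribution on 3-bit strings yields 6-bit outputs whose information measure is still 3, so no deterministic map can produce "perfect" randomness in Shannon's sense from a shorter seed.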

The third theory, initiated by Blum, Goldwasser, Micali and Yao [17, 4, 37], is rooted 
in complexity theory and is the focus of this survey. This approach is explicitly aimed at 
providing a notion of perfect randomness that nevertheless allows one to efficiently generate
perfect random strings from shorter random strings. The heart of this approach is the
suggestion to view objects as equal if they cannot be told apart by any efficient procedure.
Consequently, a distribution that cannot be efficiently distinguished from the uniform
distribution will be considered as being random (or rather called pseudorandom). Thus,
randomness is not an “inherent” property of objects (or distributions) but is rather relative 
to an observer (and its computational abilities). To demonstrate this approach, let us 
consider the following mental experiment. 

Alice and Bob play HEAD OR TAIL in one of the following four ways. In all of them Alice flips a coin high in the air, and Bob is asked to guess its outcome before the coin hits the floor. The alternative ways differ by the knowledge Bob has before making his guess. In the first alternative, Bob has to announce his guess before Alice flips the coin. Clearly, in this case Bob wins with probability 1/2. In the second alternative, Bob has to announce his guess while the coin is spinning in the air. Although the outcome is determined in principle by the motion of the coin, Bob does not have accurate information on the motion and thus we believe that also in this case Bob wins with probability 1/2. The third alternative is similar to the second, except that Bob has at his disposal sophisticated equipment capable of providing accurate information on the coin's motion as well as on the environment affecting the outcome. However, Bob cannot process this information in time to improve his guess. In the fourth alternative, Bob's recording equipment is directly connected to a powerful computer programmed to solve the motion equations and output a prediction. It is conceivable that in such a case Bob can substantially improve his guess of the outcome of the coin.

We conclude that the randomness of an event is relative to the information and computing 
resources at our disposal. Thus, a natural concept of pseudorandomness arises - a distri- 
bution is pseudorandom if no efficient procedure can distinguish it from the uniform 
distribution, where efficient procedures are associated with (probabilistic) polynomial- 
time algorithms. 



Orientation Remarks 

We consider finite objects, encoded by binary finite sequences called strings. When we 
talk of distributions we mean discrete probability distributions having a finite support 
that is a set of strings. Of special interest is the uniform distribution, that for a length
parameter $n$ (explicit or implicit in the discussion), assigns each $n$-bit string $x \in \{0,1\}^n$
equal probability (i.e., probability $2^{-n}$). We will colloquially speak of “perfectly random
strings” meaning strings selected according to such a uniform distribution. 

We associate efficient procedures with probabilistic polynomial-time algorithms. 
An algorithm is called polynomial-time if there exists a polynomial $p$ so that for any
possible input $x$, the algorithm runs in time bounded by $p(|x|)$, where $|x|$ denotes the
length of the string $x$. Thus, the running time of such an algorithm grows moderately as a
function of the length of its input. A probabilistic algorithm is one that can take random 
steps, where, without loss of generality, a random step consists of selecting which of 
two predetermined steps to take next so that each possible step is taken with probability 
1/2. These choices are called the algorithm’s internal coin tosses. 



Organization, Acknowledgment, and Further Details 

Sections 2 and 3 provide a basic treatment of pseudorandom generators (as briefly 
discussed in the abstract). The rest of this survey goes somewhat beyond: in Section 4
we treat pseudorandom functions, and in Section 5 we further discuss the practical and 
conceptual significance of pseudorandom generators. In Section 6 we discuss alternative 
notions of pseudorandom generators, viewing them all as special cases of a general 
paradigm. The survey is based on [11, Chap. 3], and the interested reader is referred to 
there for further details. 



2 The Notion of Pseudorandom Generators 

Loosely speaking, a pseudorandom generator is an efficient program (or algorithm) that 
stretches short random strings into long pseudorandom sequences. The latter sentence 
emphasizes three fundamental aspects in the notion of a pseudorandom generator: 

1. Efficiency: The generator has to be efficient. As we associate efficient computations
with polynomial-time ones, we postulate that the generator has to be implementable 
by a deterministic polynomial-time algorithm. 

This algorithm takes as input a string, called its seed. The seed captures a bounded 
amount of randomness used by a device that “generates pseudorandom sequences.” 
The formulation views any such device as consisting of a deterministic procedure 
applied to a random seed. 

2. Stretching: The generator is required to stretch its input seed to a longer output
sequence. Specifically, it stretches $n$-bit long seeds into $\ell(n)$-bit long outputs, where
$\ell(n) > n$. The function $\ell$ is called the stretching measure (or stretching function)
of the generator.

3. Pseudorandomness: The generator’s output has to look random to any efficient 
observer. That is, any efficient procedure should fail to distinguish the output of a 
generator (on a random seed) from a truly random sequence of the same length. 
The formulation of the last sentence refers to a general notion of computational 
indistinguishability, which is the heart of the entire approach. 

2.1 Computational Indistinguishability 

Intuitively, two objects are called computationally indistinguishable if no efficient procedure can tell them apart. As usual in complexity theory, an elegant formulation requires asymptotic analysis (or rather a functional treatment of the running time of algorithms in terms of the length of their input).³ Thus, the objects in question are infinite sequences of distributions, where each distribution has a finite support. Such a sequence will be called a distribution ensemble. Typically, we consider distribution ensembles of the form $\{D_n\}_{n \in \mathbb{N}}$, where for some function $\ell : \mathbb{N} \to \mathbb{N}$, the support of each $D_n$ is a subset of $\{0,1\}^{\ell(n)}$. Furthermore, typically $\ell$ will be a positive polynomial. For such $D_n$, we denote by $e \sim D_n$ the process of selecting $e$ according to distribution $D_n$. Consequently, for a predicate $P$, we denote by $\Pr_{e \sim D_n}[P(e)]$ the probability that $P(e)$ holds when $e$ is distributed (or selected) according to $D_n$.

Definition 1 (Computational Indistinguishability [17, 37]): Two probability ensembles, $\{X_n\}_{n \in \mathbb{N}}$ and $\{Y_n\}_{n \in \mathbb{N}}$, are called computationally indistinguishable if for any probabilistic polynomial-time algorithm $A$, for any positive polynomial $p$, and for all sufficiently large $n$'s

$$\left| \Pr_{x \sim X_n}[A(x) = 1] - \Pr_{y \sim Y_n}[A(y) = 1] \right| < \frac{1}{p(n)}$$

The probability is taken over $X_n$ (resp., $Y_n$) as well as over the coin tosses of algorithm $A$.

A couple of comments are in order. Firstly, we have allowed algorithm A (called a
distinguisher) to be probabilistic. This makes the requirement only stronger, and seems
essential to several important aspects of our approach. Secondly, we view events occurring
with probability that is upper bounded by the reciprocal of polynomials as negligible.
This is well-coupled with our notion of efficiency (i.e., polynomial-time computations): 
An event that occurs with negligible probability (as a function of a parameter n), will 
also occur with negligible probability if the experiment is repeated for poly(n)-many 
times. 

We note that computational indistinguishability is a strictly more liberal notion than 
statistical indistinguishability (cf. [37, 15]). An important case is the one of distributions 
generated by a pseudorandom generator as defined next. 
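The distinguishing gap in Definition 1 can be estimated empirically for concrete samplers. The sketch below is a toy illustration, not part of the survey: the samplers, the `majority` distinguisher and all function names are assumptions chosen for the example, and a sampled estimate is of course only a statistical approximation of the true gap.

```python
import random


def estimate_advantage(distinguisher, sample_x, sample_y, trials, rng):
    """Estimate |Pr[A(X)=1] - Pr[A(Y)=1]| from `trials` samples of each distribution."""
    hits_x = sum(distinguisher(sample_x(rng)) for _ in range(trials))
    hits_y = sum(distinguisher(sample_y(rng)) for _ in range(trials))
    return abs(hits_x - hits_y) / trials


def uniform_bits(n):
    """Sampler for n independent unbiased bits."""
    return lambda rng: [rng.randrange(2) for _ in range(n)]


def biased_bits(n, p_one):
    """Sampler for n independent bits, each equal to 1 with probability p_one."""
    return lambda rng: [1 if rng.random() < p_one else 0 for _ in range(n)]


def majority(bits):
    """A simple efficient distinguisher: output 1 iff more than half the bits are 1."""
    return 1 if 2 * sum(bits) > len(bits) else 0
```

A strongly biased source is told apart from the uniform one by `majority` with a large estimated gap, whereas two copies of the uniform sampler give an estimate near zero.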

2.2 Basic Definition and Initial Discussion

We are now ready for the main definition. Recall that a stretching function, $\ell : \mathbb{N} \to \mathbb{N}$, satisfies $\ell(n) > n$ for all $n$.

Definition 2 (Pseudorandom Generators [4, 37]): A deterministic polynomial-time algorithm $G$ is called a pseudorandom generator if there exists a stretching function, $\ell : \mathbb{N} \to \mathbb{N}$, so that the following two probability ensembles, denoted $\{G_n\}_{n \in \mathbb{N}}$ and $\{R_n\}_{n \in \mathbb{N}}$, are computationally indistinguishable:

1. Distribution $G_n$ is defined as the output of $G$ on a uniformly selected seed in $\{0,1\}^n$.

2. Distribution $R_n$ is defined as the uniform distribution on $\{0,1\}^{\ell(n)}$.

That is, letting $U_m$ denote the uniform distribution over $\{0,1\}^m$, we require that for any probabilistic polynomial-time algorithm $A$, for any positive polynomial $p$, and for all sufficiently large $n$'s

$$\left| \Pr_{s \sim U_n}[A(G(s)) = 1] - \Pr_{r \sim U_{\ell(n)}}[A(r) = 1] \right| < \frac{1}{p(n)}$$

³ We stress that the asymptotic (or functional) treatment is not essential to this approach. One may develop the entire approach in terms of inputs of fixed lengths and an adequate notion of complexity of algorithms. However, such an alternative treatment is more cumbersome.

Thus, pseudorandom generators are efficient (i.e., polynomial-time) deterministic pro- 
grams that expand short randomly selected seeds into longer pseudorandom bit se- 
quences, where the latter are defined as computationally indistinguishable from truly 
random sequences by efficient (i.e., polynomial-time) algorithms. It follows that any 
efficient randomized algorithm maintains its performance when its internal coin tosses 
are substituted by a sequence generated by a pseudorandom generator. That is:

Construction 3 (typical application of pseudorandom generators): Let $A$ be a probabilistic polynomial-time algorithm, and $\rho(n)$ denote an upper bound on its randomness complexity. Let $A(x, r)$ denote the output of $A$ on input $x$ and coin tosses sequence $r \in \{0,1\}^{\rho(|x|)}$. Let $G$ be a pseudorandom generator with stretching function $\ell : \mathbb{N} \to \mathbb{N}$. Then $A_G$ is a randomized algorithm that on input $x$, proceeds as follows. It sets $k = k(|x|)$ to be the smallest integer such that $\ell(k) \geq \rho(|x|)$, uniformly selects $s \in \{0,1\}^k$, and outputs $A(x, r)$, where $r$ is the $\rho(|x|)$-bit long prefix of $G(s)$.
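Construction 3 can be mirrored in code as a thin wrapper around a randomized algorithm. Everything concrete below is an assumption made for illustration: `toy_generator` uses SHA-256 in counter mode as a heuristic stand-in for $G$ (it is not proven pseudorandom), the stretch function is taken to be $\ell(k) = k^2$, and `algorithm_a` is a toy sampling algorithm with randomness complexity $\rho(n) = 8n$.

```python
import hashlib
import secrets


def ell(k):
    """Assumed stretch function of the stand-in generator."""
    return k * k


def rho(n):
    """Assumed randomness complexity of the toy algorithm A."""
    return 8 * n


def toy_generator(seed):
    """Heuristic stand-in for G: expand a k-bit seed to ell(k) bits. NOT proven pseudorandom."""
    out, counter = "", 0
    while len(out) < ell(len(seed)):
        digest = hashlib.sha256(f"{seed}|{counter}".encode()).digest()
        out += "".join(f"{byte:08b}" for byte in digest)
        counter += 1
    return out[: ell(len(seed))]


def algorithm_a(x, r):
    """Toy A(x, r): sample |x| positions of the non-empty bit string x using coins r,
    and return the majority value among the sampled bits."""
    assert len(r) == rho(len(x))
    votes = 0
    for j in range(len(x)):
        idx = int(r[8 * j: 8 * j + 8], 2) % len(x)
        votes += 1 if x[idx] == "1" else -1
    return 1 if votes > 0 else 0


def seed_length(n):
    """Smallest k with ell(k) >= rho(n), as in Construction 3."""
    k = 1
    while ell(k) < rho(n):
        k += 1
    return k


def algorithm_a_g(x, seed=None):
    """A_G: run A on the rho(|x|)-bit prefix of G(s) for a short uniformly chosen seed s."""
    k = seed_length(len(x))
    if seed is None:
        seed = "".join(secrets.choice("01") for _ in range(k))
    assert len(seed) == k
    r = toy_generator(seed)[: rho(len(x))]
    return algorithm_a(x, r)
```

For a 32-bit input, `algorithm_a` consumes 256 coins while `algorithm_a_g` draws only a 16-bit seed under these assumed parameters.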

It can be shown that it is infeasible to find long $x$'s on which the input-output behavior
of $A_G$ is noticeably different from the one of $A$, although $A_G$ may use much fewer coin
tosses than $A$. That is:



Proposition 4 Let $A$ and $G$ be as above. For any algorithm $D$, let $\Delta_{A,D}(x)$ denote the discrepancy, as judged by $D$, in the behavior of $A$ and $A_G$ on input $x$. That is,

$$\Delta_{A,D}(x) = \left| \Pr_{r \sim U_{\rho(|x|)}}[D(x, A(x, r)) = 1] - \Pr_{s \sim U_{k(|x|)}}[D(x, A_G(x, s)) = 1] \right|$$

where the probabilities are taken over the $U_m$'s as well as over the coin tosses of $D$. Then for every pair of probabilistic polynomial-time algorithms, a finder $F$ and a distinguisher $D$, every positive polynomial $p$ and all sufficiently long $n$'s

$$\Pr\left[ \Delta_{A,D}(F(1^n)) > \frac{1}{p(n)} \right] < \frac{1}{p(n)}$$

where $|F(1^n)| = n$ and the probability is taken over the coin tosses of $F$.



The proposition is proven by showing that a triplet {A, F,D) violating the claim can 
be converted into an algorithm D' that distinguishes the output of G from the uniform 
distribution, in contradiction to the hypothesis. Analogous arguments are applied when- 
ever one wishes to prove that an efficient randomized process (be it an algorithm as 
above or a multi-party computation) preserves its behavior when one replaces true ran- 
domness by pseudorandomness as defined above. Thus, given pseudorandom generators 
with large stretching function, one can considerably reduce the randomness complexity 
in any efficient application.

2.3 Amplifying the Stretch Function

Pseudorandom generators as defined above are only required to stretch their input a bit; for example, stretching $n$-bit long inputs to $(n+1)$-bit long outputs will do. Clearly, generators with such a moderate stretch function are of little use in practice. In contrast, we want to have pseudorandom generators with an arbitrarily long stretch function. By the efficiency requirement, the stretch function can be at most polynomial. It turns out that pseudorandom generators with the smallest possible stretch function can be used to construct pseudorandom generators with any desirable polynomial stretch function. (Thus, when talking about the existence of pseudorandom generators, we may ignore the stretch function.)

Theorem 5 [14]: Let $G$ be a pseudorandom generator with stretch function $\ell(n) = n + 1$, and $\ell'$ be any polynomially-bounded stretch function, that is polynomial-time computable. Let $G_1(x)$ denote the $|x|$-bit long prefix of $G(x)$, and $G_2(x)$ denote the last bit of $G(x)$ (i.e., $G(x) = G_1(x)\,G_2(x)$). Then

$$G'(s) \stackrel{\rm def}{=} \sigma_1 \sigma_2 \cdots \sigma_{\ell'(|s|)},$$

where $x_0 = s$, $\sigma_i = G_2(x_{i-1})$ and $x_i = G_1(x_{i-1})$, for $i = 1, \ldots, \ell'(|s|)$,
is a pseudorandom generator with stretch function $\ell'$.
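The iteration in Theorem 5 is short enough to write out directly. In the sketch below, `toy_g` is a hypothetical stand-in for a generator with stretch $n + 1$, built from SHA-256 purely for illustration (it carries no proof of pseudorandomness); only `amplify` reflects the construction in the theorem.

```python
import hashlib


def toy_g(x):
    """Heuristic stand-in for G with stretch |x| + 1. NOT proven pseudorandom."""
    n = len(x)
    out, counter = "", 0
    while len(out) < n + 1:
        digest = hashlib.sha256(f"{counter}:{x}".encode()).digest()
        out += "".join(f"{byte:08b}" for byte in digest)
        counter += 1
    return out[: n + 1]


def amplify(g, seed, out_len):
    """G'(s) = sigma_1 ... sigma_out_len with sigma_i = G2(x_{i-1}), x_i = G1(x_{i-1})."""
    x, bits = seed, []
    for _ in range(out_len):
        y = g(x)
        x, sigma = y[:-1], y[-1]  # G1 = |x|-bit prefix, G2 = last bit
        bits.append(sigma)
    return "".join(bits)
```

Because each output bit depends only on the current state, a shorter output is always a prefix of a longer one produced from the same seed.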

Proof Sketch: The theorem is proven using the hybrid technique (cf. [10, Sec. 3.2.3]): One considers distributions $H_n^i$ (for $i = 0, \ldots, \ell'(n)$) defined by $U_i^{(1)} P_{\ell'(n)-i}(U_n^{(2)})$, where $U_i^{(1)}$ and $U_n^{(2)}$ are independent uniform distributions (over $\{0,1\}^i$ and $\{0,1\}^n$, respectively), and $P_j(x)$ denotes the $j$-bit long prefix of $G'(x)$. The extreme hybrids correspond to $G'(U_n)$ and $U_{\ell'(n)}$, whereas distinguishability of neighboring hybrids can be worked into distinguishability of $G(U_n)$ and $U_{n+1}$. Loosely speaking, suppose one could distinguish $H_n^i$ from $H_n^{i+1}$. Then, using $P_j(s) = G_2(s) P_{j-1}(G_1(s))$ (for $j \geq 1$), this means that one can distinguish $H_n^i \equiv (U_i^{(1)}, G_2(U_n^{(2)}), P_{(\ell'(n)-i)-1}(G_1(U_n^{(2)})))$ from $H_n^{i+1} \equiv (U_i^{(1)}, U_1^{(1')}, P_{\ell'(n)-(i+1)}(U_n^{(2')}))$. Incorporating the generation of $U_i^{(1)}$ and the evaluation of $P_{\ell'(n)-i-1}$ into the distinguisher, one could distinguish $(G_1(U_n^{(2)}), G_2(U_n^{(2)})) \equiv G(U_n)$ from $(U_n^{(2')}, U_1^{(1')}) \equiv U_{n+1}$, in contradiction to the pseudorandomness of $G$. ∎



3 How to Construct Pseudorandom Generators 

The known constructions transform computational difficulty, in the form of one-way functions (defined below), into pseudorandom generators. Loosely speaking, a polyno-
mial-time computable function is called one-way if any efficient algorithm can invert 
it only with negligible success probability. For simplicity, we consider only length- 
preserving one-way functions. 

Definition 6 (one-way function): A one-way function, $f$, is a polynomial-time computable function such that for every probabilistic polynomial-time algorithm $A'$, every positive polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\Pr_{x \sim U_n}\left[ A'(f(x)) \in f^{-1}(f(x)) \right] < \frac{1}{p(n)}$$

where $U_n$ is the uniform distribution over $\{0,1\}^n$.

Popular candidates for one-way functions are based on the conjectured intractability of integer factorization (cf. [30] for state of the art), the discrete logarithm problem (cf. [31] analogously), and decoding of random linear codes [16]. The infeasibility of inverting $f$ yields a weak notion of unpredictability: Let $b_i(x)$ denote the $i$-th bit of $x$. Then, for every probabilistic polynomial-time algorithm $A$ (and sufficiently large $n$), it must be the case that $\Pr_{i,x}[A(i, f(x)) \neq b_i(x)] > 1/2n$, where the probability is taken uniformly over $i \in \{1, \ldots, n\}$ and $x \in \{0,1\}^n$. A stronger (and in fact strongest possible) notion of unpredictability is that of a hard-core predicate. Loosely speaking, a polynomial-time computable predicate $b$ is called a hard-core of a function $f$ if any efficient algorithm, given $f(x)$, can guess $b(x)$ only with success probability that is negligibly better than half.

Definition 7 (hard-core predicate [4]): A polynomial-time computable predicate $b : \{0,1\}^* \to \{0,1\}$ is called a hard-core of a function $f$ if for every probabilistic polynomial-time algorithm $A'$, every positive polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\Pr_{x \sim U_n}\left[ A'(f(x)) = b(x) \right] < \frac{1}{2} + \frac{1}{p(n)}$$

Clearly, if $b$ is a hard-core of a 1-1 polynomial-time computable function $f$ then $f$ must be one-way.⁴ It turns out that any one-way function can be slightly modified so that it has a hard-core predicate.

Theorem 8 (A generic hard-core [13]): Let $f$ be an arbitrary one-way function, and let $g$ be defined by $g(x, r) \stackrel{\rm def}{=} (f(x), r)$, where $|x| = |r|$. Let $b(x, r)$ denote the inner-product mod 2 of the binary vectors $x$ and $r$. Then the predicate $b$ is a hard-core of the function $g$.

See proof in [11, Apdx C.2]. We are now ready to present constructions of pseudorandom generators.
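The predicate and the modified function in Theorem 8 are simple to state in code. The sketch below represents bit vectors as lists of 0/1 integers; the function names are hypothetical, and `g` accepts any callable `f` without checking that it is one-way.

```python
def inner_product_mod2(x, r):
    """b(x, r): inner product mod 2 of two equal-length bit vectors."""
    if len(x) != len(r):
        raise ValueError("x and r must have the same length")
    return sum(a & b for a, b in zip(x, r)) % 2


def g(f, x, r):
    """g(x, r) = (f(x), r), where f is the underlying (assumed one-way) function."""
    if len(x) != len(r):
        raise ValueError("x and r must have the same length")
    return (f(x), r)
```

The predicate is linear in `r` over GF(2), that is, the value on the bitwise XOR of two vectors is the XOR of the two values.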

3.1 The Preferred Presentation 

In view of Theorem 5, we may focus on constructing pseudorandom generators with stretch function $\ell(n) = n + 1$. Such a construction is presented next.

Proposition 9 (A simple construction of pseudorandom generators): Let $b$ be a hard-core predicate of a polynomial-time computable 1-1 function $f$. Then, $G(s) \stackrel{\rm def}{=} f(s)\,b(s)$ is a pseudorandom generator.

⁴ Functions that are not 1-1 may have hard-core predicates of information-theoretic nature; but these are of no use to us here. For example, functions of the form $f(\sigma, x) = 0f'(x)$ (for $\sigma \in \{0,1\}$) have an “information theoretic” hard-core predicate $b(\sigma, x) = \sigma$.

Proof Sketch: Clearly the $|s|$-bit long prefix of $G(s)$ is uniformly distributed (since $f$ is 1-1 and onto $\{0,1\}^{|s|}$). Hence, the proof boils down to showing that distinguishing $f(s)b(s)$ from $f(s)\sigma$, where $\sigma$ is a random bit, yields a contradiction to the hypothesis that $b$ is a hard-core of $f$ (i.e., that $b(s)$ is unpredictable from $f(s)$). Intuitively, such a distinguisher also distinguishes $f(s)b(s)$ from $f(s)\overline{b(s)}$, where $\overline{\sigma} = 1 - \sigma$, and so yields an algorithm for predicting $b(s)$ based on $f(s)$. ∎

In a sense, the key point in the above proof is showing that the unpredictability of 
the output of G implies its pseudorandomness. The fact that (next bit) unpredictability 
and pseudorandomness are equivalent in general is proven explicitly in the alternative 
presentation below. 

3.2 An Alternative Presentation 

The above presentation is different but analogous to the original construction of pseudorandom generators suggested by Blum and Micali [4]: Given an arbitrary stretch function $\ell : \mathbb{N} \to \mathbb{N}$, and a 1-1 one-way function $f$ with a hard-core $b$, one defines

$$G(s) \stackrel{\rm def}{=} b(x_0) b(x_1) \cdots b(x_{\ell(|s|)-1}),$$

where $x_0 = s$ and $x_i = f(x_{i-1})$ for $i = 1, \ldots, \ell(|s|) - 1$. The pseudorandomness of $G$ is established in two steps, using the notion of (next bit) unpredictability. An ensemble $\{Z_n\}_{n \in \mathbb{N}}$ is called unpredictable if any probabilistic polynomial-time machine obtaining a prefix of $Z_n$ fails to predict the next bit of $Z_n$ with probability non-negligibly higher than 1/2.
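The iteration $x_i = f(x_{i-1})$ with one output bit $b(x_i)$ per step can be sketched as follows. The concrete choices are assumptions made for illustration: $f$ is modular exponentiation $x \mapsto 5^x \bmod 23$, which permutes $\{1, \ldots, 22\}$ because 5 is a primitive root modulo 23, and $b$ is a half-interval test. With a domain of 22 elements nothing here is hard to invert; only the structure of the construction is shown.

```python
P, GEN = 23, 5  # toy parameters: 5 is a primitive root modulo 23


def f(x):
    """Toy 1-1 function on {1, ..., 22}: modular exponentiation."""
    return pow(GEN, x, P)


def hard_core(x):
    """Toy predicate: does x lie in the lower half of {1, ..., 22}?"""
    return 1 if x <= (P - 1) // 2 else 0


def blum_micali(seed, out_len):
    """Output b(x_0) b(x_1) ... b(x_{out_len - 1}) with x_0 = seed and x_i = f(x_{i-1})."""
    x, bits = seed, []
    for _ in range(out_len):
        bits.append(hard_core(x))
        x = f(x)
    return bits
```

For seed 7 the state sequence begins 7, 17, 15, so the first three output bits are 1, 0, 0.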

Step 1: One first proves that the ensemble $\{G(U_n)\}_{n \in \mathbb{N}}$, where $U_n$ is uniform over $\{0,1\}^n$, is (next-bit) unpredictable (from right to left) [4].

Loosely speaking, if one can predict $b(x_i)$ from $b(x_{i+1}) \cdots b(x_{\ell(|s|)-1})$ then one can predict $b(x_i)$ given $f(x_i)$ (i.e., by computing $x_{i+1}, \ldots, x_{\ell(|s|)-1}$, and so obtaining $b(x_{i+1}) \cdots b(x_{\ell(|s|)-1})$). But this contradicts the hard-core hypothesis.

Step 2: Next, one uses Yao's observation by which a (polynomial-time constructible) ensemble is pseudorandom if and only if it is (next-bit) unpredictable (cf. [10, Sec. 3.3.4]).

Clearly, if one can predict the next bit in an ensemble then one can distinguish this ensemble from the uniform ensemble (which is unpredictable regardless of computing power). However, here we need the other direction, which is less obvious. Still, one can show that (next bit) unpredictability implies indistinguishability from the uniform ensemble. Specifically, consider the following “hybrid” distributions, where the $i$-th hybrid takes the first $i$ bits from the questionable ensemble and the rest from the uniform one. Thus, distinguishing the extreme hybrids implies distinguishing some neighboring hybrids, which in turn implies next-bit predictability (of the questionable ensemble).
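The step from "distinguishing the extreme hybrids" to "distinguishing some neighboring hybrids" is a telescoping sum. As a worked equation under assumed notation ($m$ for the output length, $H_i$ for the $i$-th hybrid, $D$ for the distinguisher, and $\varepsilon$ for its distinguishing gap, none of which are fixed by the text above):

```latex
% H_0 is the uniform distribution on m bits, H_m is the questionable ensemble.
\[
  \Pr[D(H_m) = 1] - \Pr[D(H_0) = 1]
  \;=\; \sum_{i=0}^{m-1} \Bigl( \Pr[D(H_{i+1}) = 1] - \Pr[D(H_i) = 1] \Bigr).
\]
% Hence, if the left-hand side has absolute value at least \varepsilon,
% the triangle inequality yields an index i with
\[
  \bigl| \Pr[D(H_{i+1}) = 1] - \Pr[D(H_i) = 1] \bigr| \;\ge\; \frac{\varepsilon}{m}.
\]
```

Since $m$ is polynomial in $n$, a non-negligible gap between the extreme hybrids leaves a non-negligible gap between some neighboring pair, which differ only in whether bit $i+1$ comes from the questionable ensemble or is uniform.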

3.3 A General Condition for the Existence of Pseudorandom Generators

Recall that given any one-way 1-1 function, we can easily construct a pseudorandom generator. Actually, the 1-1 requirement may be dropped, but the currently known construction - for the general case - is quite complex. Still, we do have:

Theorem 10 (On the existence of pseudorandom generators [18]):

Pseudorandom generators exist if and only if one-way functions exist.

To show that the existence of pseudorandom generators implies the existence of one-way functions, consider a pseudorandom generator $G$ with stretch function $\ell(n) = 2n$. For $x, y \in \{0,1\}^n$, define $f(x, y) \stackrel{\rm def}{=} G(x)$, and so $f$ is polynomial-time computable (and length-preserving). It must be that $f$ is one-way, or else one can distinguish $G(U_n)$ from $U_{2n}$ by trying to invert and checking the result: Inverting $f$ on its range distribution refers to the distribution $G(U_n)$, whereas the probability that $U_{2n}$ has an inverse under $f$ is negligible.

The interesting direction is the construction of pseudorandom generators based on any one-way function. In general (when $f$ may not be 1-1) the ensemble $f(U_n)$ may not be pseudorandom, and so the construction of Proposition 9 (i.e., $G(s) = f(s)b(s)$, where $b$ is a hard-core of $f$) cannot be used directly. One idea of [18] is to hash $f(U_n)$ to an almost uniform string of length related to its entropy, using Universal Hash Functions [5]. (This is done after guaranteeing that the logarithm of the probability mass of a value of $f(U_n)$ is typically close to the entropy of $f(U_n)$.)⁵ But “hashing $f(U_n)$ down to length comparable to the entropy” means shrinking the length of the output to, say, $n' < n$. This foils the entire point of stretching the $n$-bit seed. Thus, a second idea of [18] is to compensate for the $n - n'$ loss by extracting these many bits from the seed $U_n$ itself. This is done by hashing $U_n$, and the point is that the $(n - n' + 1)$-bit long hash value does not make the inverting task any easier. Implementing these ideas turns out to be more difficult than it seems, and indeed an alternative construction would be most appreciated.



4 Pseudorandom Functions 

Pseudorandom generators allow one to efficiently generate long pseudorandom sequences from short random seeds. Pseudorandom functions (defined below) are even more powerful: They allow efficient direct access to a huge pseudorandom sequence (which is infeasible to scan bit-by-bit). Put in other words, pseudorandom functions can replace truly random functions in any efficient application (e.g., most notably in cryptography). That is, pseudorandom functions are indistinguishable from random functions by efficient machines that may obtain the function values at arguments of their choice. (Such machines are called oracle machines, and if $M$ is such a machine and $f$ is a function, then $M^f(x)$ denotes the computation of $M$ on input $x$ when $M$'s queries are answered by the function $f$.)

Definition 11 (pseudorandom functions [12]): A pseudorandom function (ensemble),
with length parameters $\ell_D, \ell_R : \mathbb{N} \to \mathbb{N}$, is a collection of functions
$F = \{f_s : \{0,1\}^{\ell_D(|s|)} \to \{0,1\}^{\ell_R(|s|)}\}_{s \in \{0,1\}^*}$ satisfying

- (efficient evaluation): There exists an efficient (deterministic) algorithm that, given
a seed, $s$, and an $\ell_D(|s|)$-bit argument, $x$, returns the $\ell_R(|s|)$-bit long value $f_s(x)$.
(Thus, the seed $s$ is an “effective description” of the function $f_s$.)

- (pseudorandomness): For every probabilistic polynomial-time oracle machine,
$M$, for every positive polynomial $p$ and all sufficiently large $n$'s

$$\left| \Pr_{f \leftarrow F_n}[M^f(1^n) = 1] - \Pr_{\rho \leftarrow R_n}[M^\rho(1^n) = 1] \right| < \frac{1}{p(n)}$$

where $F_n$ denotes the distribution on $f_s \in F$ obtained by selecting $s$ uniformly
in $\{0,1\}^n$, and $R_n$ denotes the uniform distribution over all functions mapping
$\{0,1\}^{\ell_D(n)}$ to $\{0,1\}^{\ell_R(n)}$.

^ Specifically, given an arbitrary one-way function $f'$, one first constructs $f$ by taking a “direct
product” of sufficiently many copies of $f'$. For example, for $x_1, \ldots, x_{n^2} \in \{0,1\}^n$, we let
$f(x_1, \ldots, x_{n^2}) \stackrel{\rm def}{=} f'(x_1), \ldots, f'(x_{n^2})$.



Suppose, for simplicity, that $\ell_D(n) = n$ and $\ell_R(n) = 1$. Then a function uniformly
selected among $2^n$ functions (of a pseudorandom ensemble) presents an input-output
behavior that is indistinguishable in poly(n)-time from the one of a function selected at
random among all the $2^{2^n}$ Boolean functions. Contrast this with the $2^n$ pseudorandom
sequences, produced by a pseudorandom generator, that are computationally
indistinguishable from a sequence selected uniformly among all the $2^{\mathrm{poly}(n)}$ many sequences.

Still, pseudorandom functions can be constructed from any pseudorandom generator.

Theorem 12 (How to construct pseudorandom functions [12]): Let $G$ be a pseudorandom
generator with stretching function $\ell(n) = 2n$. Let $G_0(s)$ (resp., $G_1(s)$) denote the
first (resp., last) $|s|$ bits in $G(s)$, and

$$G_{\sigma_{|s|} \cdots \sigma_2 \sigma_1}(s) \stackrel{\rm def}{=} G_{\sigma_{|s|}}(\cdots G_{\sigma_2}(G_{\sigma_1}(s)) \cdots)$$

Then, the function ensemble $\{f_s : \{0,1\}^{|s|} \to \{0,1\}^{|s|}\}_{s \in \{0,1\}^*}$, where $f_s(x) \stackrel{\rm def}{=} G_x(s)$,
is pseudorandom with length parameters $\ell_D(n) = \ell_R(n) = n$.
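
The tree-like evaluation in Theorem 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the length-doubling map `G` below is a hypothetical stand-in built from SHA-256 (it is not a proven pseudorandom generator), seeds are byte strings rather than bit strings, and the argument `x` is given as a string of `'0'`/`'1'` characters whose rightmost symbol plays the role of $\sigma_1$.

```python
import hashlib


def G(s: bytes) -> bytes:
    """Hypothetical length-doubling map used as a stand-in for G.

    Assumption of this sketch: SHA-256 in counter mode. It is NOT a proven
    pseudorandom generator; it only has the right input/output lengths.
    """
    out = b""
    counter = 0
    while len(out) < 2 * len(s):
        out += hashlib.sha256(counter.to_bytes(4, "big") + s).digest()
        counter += 1
    return out[: 2 * len(s)]


def G0(s: bytes) -> bytes:
    """First |s| bytes of G(s)."""
    return G(s)[: len(s)]


def G1(s: bytes) -> bytes:
    """Last |s| bytes of G(s)."""
    return G(s)[len(s):]


def f(s: bytes, x: str) -> bytes:
    """Evaluate f_s(x) = G_x(s): apply G_{sigma_1} first, G_{sigma_|x|} last."""
    value = s
    for bit in reversed(x):  # sigma_1 is the rightmost symbol of x
        value = G1(value) if bit == "1" else G0(value)
    return value
```

Each evaluation invokes `G` once per bit of `x`, so a single function value is obtained without ever writing down the exponentially long table of all values. In the theorem the argument has the same bit length as the seed; the sketch does not enforce this.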

The above construction can be easily adapted to any (polynomially-bounded) length
parameters $\ell_D, \ell_R : \mathbb{N} \to \mathbb{N}$. We mention that pseudorandom functions have been used to
derive negative results in computational learning theory [35] and in complexity theory
(e.g., in the context of Natural Proofs [32]).



5 Further Discussion of Pseudorandom Generators 

In this section we discuss some of the applications and conceptual aspects of pseudo- 
random generators. 

5.1 The Applicability of Pseudorandom Generators 

Randomness is playing an increasingly important role in computation: It is frequently 
used in the design of sequential, parallel and distributed algorithms, and is of course 
central to cryptography. Whereas it is convenient to design such algorithms making free
use of randomness, it is also desirable to minimize the usage of randomness in real
implementations (since generating perfectly random bits via special hardware is quite
expensive). Thus, pseudorandom generators (as defined above) are a key ingredient in an
“algorithmic tool-box” - they provide an automatic compiler of programs written with
free usage of randomness into programs that make an economical use of randomness.

Indeed, “pseudo-random number generators” have appeared with the first computers.
However, typical implementations use generators that are not pseudorandom according
to the above definition. Instead, at best, these generators are shown to pass some ad-hoc
statistical test (cf. [20]). We warn that the fact that a “pseudo-random number generator”
passes some statistical tests does not mean that it will pass a new test and that it is
good for a future (untested) application. Furthermore, the approach of subjecting the
generator to some ad-hoc tests fails to provide general results of the type stated above
(i.e., of the form “for all practical purposes using the output of the generator is as
good as using truly unbiased coin tosses”). In contrast, the approach encompassed in
Definition 2 aims at such generality, and in fact is tailored to obtain it: The notion of
computational indistinguishability, which underlies Definition 2, covers all possible
efficient applications, postulating that for all of them pseudorandom sequences are as
good as truly random ones.

Pseudorandom generators and functions are of key importance in Cryptography.
They are typically used to establish private-key encryption and authentication schemes
(cf. [11, Sec. 1.5.2 & 1.6.2]). For example, suppose that two parties share a random
$n$-bit string, $s$, specifying a pseudorandom function (as in Definition 11), and that $s$
is unknown to the adversary. Then, these parties may send encrypted messages to one
another by XORing the message with the value of $f_s$ at a random point. That is, to encrypt
$m \in \{0,1\}^{\ell_R(n)}$, the sender uniformly selects $r \in \{0,1\}^{\ell_D(n)}$, and sends $(r, m \oplus f_s(r))$
to the receiver. Note that the security of this encryption scheme relies on the fact that,
for every computationally-feasible adversary (not only for adversary strategies that were
envisioned and tested), the values of the function $f_s$ on such $r$'s look random.
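
The encryption scheme just described can be sketched as follows. The sketch assumes, purely for illustration, that HMAC-SHA256 keyed with $s$ plays the role of $f_s$; the names `encrypt`, `decrypt` and `BLOCK` are hypothetical, and messages are restricted to a single 32-byte block.

```python
import hashlib
import hmac
import secrets

BLOCK = 32  # bytes; both r and the message have this length in the sketch


def f(s: bytes, r: bytes) -> bytes:
    """Stand-in for the pseudorandom function f_s (assumption: HMAC-SHA256)."""
    return hmac.new(s, r, hashlib.sha256).digest()


def encrypt(s: bytes, m: bytes) -> tuple[bytes, bytes]:
    """Return the pair (r, m XOR f_s(r)) for a freshly chosen uniform r."""
    if len(m) != BLOCK:
        raise ValueError("this sketch encrypts exactly one block")
    r = secrets.token_bytes(BLOCK)
    pad = f(s, r)
    return r, bytes(a ^ b for a, b in zip(m, pad))


def decrypt(s: bytes, r: bytes, c: bytes) -> bytes:
    """Recompute f_s(r) from the shared seed and remove the pad."""
    pad = f(s, r)
    return bytes(a ^ b for a, b in zip(c, pad))
```

The receiver can decrypt because it holds the same seed $s$ and sees $r$ in the clear; an observer without $s$ sees $r$ together with a string masked by a value of $f_s$ at a random point.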

5.2 The Intellectual Contents of Pseudorandom Generators 

We briefly discuss some intellectual aspects of pseudorandom generators as defined
above.

Behavioristic versus Ontological. Our definition of pseudorandom generators is based
on the notion of computational indistinguishability. The behavioristic nature of the latter
notion is best demonstrated by confronting it with the Kolmogorov-Chaitin approach to 
randomness. Loosely speaking, a string is Kolmogorov-random if its length equals the 
length of the shortest program producing it. This shortest program may be considered the 
“true explanation” to the phenomenon described by the string. A Kolmogorov-random 
string is thus a string that does not have a substantially simpler (i.e., shorter) explanation 
than itself. Considering the simplest explanation of a phenomenon may be viewed as 
an ontological approach. In contrast, considering the effect of phenomena (on an ob- 
server), as underlying the definition of pseudorandomness, is a behavioristic approach. 
Furthermore, there exist probability distributions that are not uniform (and are not even 
statistically close to a uniform distribution) but nevertheless are indistinguishable from
a uniform distribution by any efficient procedure [37, 15]. Thus, distributions that are
ontologically very different are considered equivalent by the behavioristic point of view
taken in the definitions above.

A relativistic view of randomness. Pseudorandomness is defined above in terms of
its observer. It is a distribution that cannot be told apart from a uniform distribution 
by any efficient (i.e. polynomial-time) observer. However, pseudorandom sequences 
may be distinguished from random ones by infinitely powerful computers (not at our 
disposal!). Specifically, an exponential-time machine can easily distinguish the output 
of a pseudorandom generator from a uniformly selected string of the same length (e.g., 
just by trying all possible seeds). Thus, pseudorandomness is subjective to the abilities 
of the observer. 

Randomness and Computational Difficulty. Pseudorandomness and computational diffi- 
culty play dual roles: The definition of pseudorandomness relies on the fact that putting 
computational restrictions on the observer gives rise to distributions that are not
uniform and still cannot be distinguished from uniform. Furthermore, the construction of
pseudorandom generators relies on conjectures regarding computational difficulty (i.e.,
the existence of one-way functions), and this is inevitable: given a pseudorandom
generator, we can construct one-way functions. Thus, (non-trivial) pseudorandomness and
computational hardness can be converted back and forth. 

6 A General Paradigm 

Pseudorandomness as surveyed above can be viewed as an important special case of a 
general paradigm. A generic formulation of pseudorandom generators consists of spec- 
ifying three fundamental aspects - the stretching measure of the generators; the class 
of distinguishers that the generators are supposed to fool (i.e., the algorithms with re- 
spect to which the computational indistinguishability requirement should hold); and the 
resources that the generators are allowed to use (i.e., their own computational complex- 
ity). In the above presentation we focused on polynomial-time generators (thus having 
polynomial stretching measure) that fool any probabilistic polynomial-time observers. 
A variety of other cases are of interest too, and we briefly discuss some of them. 

6.1 Weaker Notions of Computational Indistinguishability 

Whenever the aim is to replace random sequences utilized by an algorithm with pseudo- 
random ones, one may try to capitalize on knowledge of the target algorithm. Above we 
have merely used the fact that the target algorithm runs in polynomial-time. However, 
if the application utilizes randomness in a restricted way then feeding it with sequences 
of lower “randomness-quality” may do. For example, if we know that the algorithm 
uses very little work-space then we may use weaker forms of pseudorandom genera- 
tors, which may be easier to construct, that suffice to fool bounded-space distinguishers. 
Similarly, very weak forms of pseudorandomness suffice for randomized algorithms that
can be analyzed when only referring to some specific properties of the random sequence
they use (e.g., pairwise independence of elements of the sequence). In general, weaker
notions of computational indistinguishability such as fooling space-bounded algorithms,
constant-depth circuits, and even specific tests (e.g., testing pairwise independence of
the sequence), arise naturally, and generators producing sequences that fool such
distinguishers are useful in a variety of applications. Needless to say, we advocate a
rigorous formulation of the characteristics of such applications and rigorous
constructions of generators that fool the type of distinguishers that emerge. We mention some
results of this type.

Fooling space-bounded algorithms. Here we consider space-bounded randomized algo- 
rithms that have on-line access to their random-tape, and so the potential distinguishers 
have on-line access to the input that they inspect. Two main results in this area are: 

Theorem 13 ($\mathcal{RL} \subseteq \mathcal{SC}$ [26, 27]): Any language decidable by a log-space randomized
algorithm is decidable by a polynomial-time deterministic algorithm of poly-logarithmic 
space complexity. 

Theorem 14 (The Nisan-Zuckerman Generator [29]): Any language decidable by a 
linear-space polynomial-time randomized algorithm is decidable by a randomized al- 
gorithm of the same complexities that uses only a linear number of coin tosses. 

Both theorems are actually special cases of more general results that refer to arbitrary 
computations (rather than to decision problems). 

Fooling constant-depth circuits. As a special case, we consider the problem of approx- 
imately counting the number of satisfying assignments of a DNF formula. Put in other 
words, we wish to generate “pseudorandom” sequences that are as likely to satisfy a 
given DNF formula as uniformly selected sequences. Nisan showed that such “pseu- 
dorandom” sequences can be produced using seeds of polylogarithmic length [25]. By 
trying all possible seeds, one can approximately count the number of satisfying assign- 
ments of a DNF formula in deterministic quasi-polynomial time. 

Pairwise independent generators. We consider distributions of $n$-long sequences over a
finite set $S$. For $t \in \mathbb{N}$, such a distribution is called $t$-wise independent if its projection
on any $t$ coordinates yields a distribution that is uniform over $S^t$. We focus on the case
where $|S|$ is a prime power, and so $S$ can be identified with a finite field $F$. In such a
case, given $1^n$, $1^t$ and a representation of the field $F$ so that $|F| > n$, one can generate
a $t$-wise independent distribution over $F^n$ in polynomial-time, using a random seed
of length $t \cdot \log_2 |F|$. Specifically, the seed is used to specify a polynomial of degree
$t - 1$ over $F$, and the $i$-th element in the output sequence is the result of evaluating this
polynomial at the $i$-th field element (cf. [2, 7]).
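
The polynomial-evaluation generator can be sketched as follows. The sketch is restricted, as an assumption, to prime fields $\mathbb{Z}_p$ (general prime powers would need extension-field arithmetic), takes the $i$-th field element to be the integer $i$, and uses hypothetical function names. The last function checks $t$-wise independence exhaustively over all $p^t$ seeds, which is feasible only for tiny parameters.

```python
import secrets
from collections import Counter
from itertools import combinations, product


def t_wise_sequence(coeffs, n, p):
    """Evaluate the polynomial with coefficients coeffs over Z_p at 0, ..., n-1.

    coeffs = (c_0, ..., c_{t-1}) is the seed; p is assumed to be prime.
    """
    if n > p:
        raise ValueError("need n distinct field elements, so n <= p")
    out = []
    for point in range(n):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * point + c) % p
        out.append(acc)
    return out


def random_t_wise_sequence(t, n, p):
    """Draw a uniform seed of t field elements and expand it to n elements."""
    coeffs = [secrets.randbelow(p) for _ in range(t)]
    return t_wise_sequence(coeffs, n, p)


def is_t_wise_independent(t, n, p):
    """Exhaustively verify uniformity of every projection on t coordinates."""
    seqs = [t_wise_sequence(c, n, p) for c in product(range(p), repeat=t)]
    for coords in combinations(range(n), t):
        counts = Counter(tuple(s[i] for i in coords) for s in seqs)
        if len(counts) != p ** t or len(set(counts.values())) != 1:
            return False
    return True
```

The check succeeds because a polynomial of degree at most $t - 1$ is determined by its values at any $t$ distinct points, so every pattern on $t$ coordinates arises from exactly one seed.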

Small-bias generators. Here, we consider distributions of $n$-long sequences over $\{0,1\}$.
For $\epsilon \in [0,1]$, such a distribution is called $\epsilon$-bias if for every non-empty subset $I$, the
exclusive-or of the bits at locations $I$ equals 1 with probability at least $(1 - \epsilon) \cdot \frac{1}{2}$ and at
most $(1 + \epsilon) \cdot \frac{1}{2}$.

Theorem 15 (small-bias generators [24]): Given $n$ and $\epsilon$, one can generate an $\epsilon$-bias
distribution over $\{0,1\}^n$ in $\mathrm{poly}(n, \log(1/\epsilon))$-time, using a random seed of length
$O(\log(n/\epsilon))$.

See [11, Sec. 3.6.2] for more details. 
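
To make the definition of $\epsilon$-bias concrete, the following sketch computes, by brute force, the smallest $\epsilon$ for which a given distribution is $\epsilon$-bias. It is not the generator of Theorem 15. The function name `max_bias` is hypothetical, and the distribution is assumed to be uniform over an explicitly listed multiset of bit tuples.

```python
from itertools import product


def max_bias(support):
    """Smallest eps such that the uniform distribution on `support` is eps-bias.

    `support` is a list of equal-length tuples of 0/1 values. For each
    non-empty set I of positions (encoded as a 0/1 mask) we compute
    p = Pr[XOR of bits in I equals 1]; the condition
    (1 - eps)/2 <= p <= (1 + eps)/2 is the same as |2p - 1| <= eps.
    """
    n = len(support[0])
    worst = 0.0
    for mask in product((0, 1), repeat=n):
        if not any(mask):
            continue  # the empty set is excluded by the definition
        ones = sum(
            sum(bit for bit, chosen in zip(x, mask) if chosen) % 2
            for x in support
        )
        p = ones / len(support)
        worst = max(worst, abs(2 * p - 1))
    return worst
```

For example, the uniform distribution over all of $\{0,1\}^3$ has bias 0, whereas the uniform distribution over the four even-parity strings of length 3 has bias 1, because the exclusive-or of all three bits is always 0.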

Samplers (and hitters) and extractors (and dispersers). Here we consider an arbitrary
function $\nu : \{0,1\}^n \to [0,1]$, and seek a universal procedure for approximating the
average value of $\nu$, denoted $\bar{\nu}$ (i.e., $\bar{\nu} \stackrel{\rm def}{=} 2^{-n} \sum_x \nu(x)$). Such a (randomized) procedure
is called a sampler. It is given three parameters, $n$, $\epsilon$ and $\delta$, as well as oracle access to
$\nu$, and needs to output a value $\tilde{\nu}$ so that $\Pr[|\tilde{\nu} - \bar{\nu}| > \epsilon] < \delta$. A hitter is given the
parameters, $n$, $\epsilon$ and $\delta$, as well as a value $v$ so that $|\{x : \nu(x) = v\}| > \epsilon \cdot 2^n$ (and
oracle access to $\nu$), and is required to find, with probability at least $1 - \delta$, a preimage $x$
so that $\nu(x) = v$. A sampler is called non-adaptive if it determines its queries based
only on its internal coin tosses (i.e., independently of the answers obtained for previous
queries); it is called oblivious if its output is a predetermined function of the sequence
of oracle answers; and it is called averaging if its output equals the average value of 
the oracle answers. (In a sense, a non-adaptive sampler corresponds to a “pseudorandom 
generator” that produces at random a sequence of queries that, with high probability, 
needs to be “representative” of the average value of any function.) We focus on the 
randomness and query complexities of samplers, and mention that any sampler yields a 
hitter with identical complexities. 

Theorem 16 (The Median-of-Averages Sampler [3]): There exists a polynomial-time
(oblivious) sampler of randomness complexity $O(n + \log(1/\delta))$ and query complexity
$O(\epsilon^{-2} \log(1/\delta))$. Specifically, the sampler outputs the median value among $O(\log(1/\delta))$
values, where each of these values is the average of $O(\epsilon^{-2})$ distinct oracle answers.^
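
The median-of-averages idea can be illustrated by the simplified sketch below. It departs from the theorem in one important respect, stated here as an assumption: every query is drawn with fresh independent randomness, so the sketch matches the query complexity but not the low randomness complexity (and queries within a group need not be distinct). The function name and the constants are illustrative choices, not taken from [3].

```python
import math
import random
import statistics


def median_of_averages(v, n, eps, delta, rng=random):
    """Estimate the average of v over n-bit integers (simplified sketch).

    v maps an integer in [0, 2**n) to a value in [0, 1].
    Each group averages ceil(4 / eps**2) samples, so by Chebyshev's
    inequality a single average is off by more than eps with probability
    at most 1/16. With g independent groups the median is off only if at
    least half of them are, which has probability at most 2**(-g).
    """
    per_average = math.ceil(4 / eps ** 2)
    groups = max(1, math.ceil(math.log2(1 / delta)))
    if groups % 2 == 0:
        groups += 1  # an odd count makes the median one of the averages
    averages = []
    for _ in range(groups):
        total = sum(v(rng.getrandbits(n)) for _ in range(per_average))
        averages.append(total / per_average)
    return statistics.median(averages)
```

Taking the median rather than one large average is what makes the dependence on $\delta$ logarithmic: a single bad group cannot move the output, only a majority of bad groups can.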

The randomness complexity can be further reduced to $n + O(\log(1/\epsilon\delta))$, and both
complexities are optimal up to a constant multiplicative factor; see [11, Sec. 3.6.4].
Averaging samplers are closely related to extractors, but the study of the latter tends
to focus more closely on the randomness complexity (and allow query complexity that
is polynomial in the above).^ A function $E : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$ is called a
$(k, \epsilon)$-extractor if for any random variable $X$ so that $\max_x \{\Pr[X = x]\} \leq 2^{-k}$ it holds
that the statistical difference between $E(X, U_t)$ and $U_m$ is at most $\epsilon$, where $U_t$ and $U_m$
are independently and uniformly distributed over $\{0,1\}^t$ and $\{0,1\}^m$, respectively. (An
averaging sampler of randomness complexity $r(m, \epsilon, \delta)$ and query complexity $q(m, \epsilon, \delta)$
corresponds to an extractor in which (the yet unspecified parameters are) $n = r(m, \epsilon, \delta)$,
$t = \log_2 q(m, \epsilon, \delta)$, and $k = n - \log_2(1/\delta)$.) A landmark in the study of extractors is
the following

Theorem 17 (Trevisan’s Extractor [36]): For any $a, b > 0$, let $k(n) = n^a$ and $m(n) =
\lfloor k(n)^{1-b} \rfloor$. For $t(n) = O(\log n)$ and $\epsilon(n) > 1/\mathrm{poly}(n)$, there exists a polynomial-time
computable family of functions $\{E_n : \{0,1\}^n \times \{0,1\}^{t(n)} \to \{0,1\}^{m(n)}\}_{n \in \mathbb{N}}$ such that
$E_n$ is a $(k(n), \epsilon(n))$-extractor.

^ Each of the $O(\log(1/\delta))$ sequences of $O(\epsilon^{-2})$ queries is produced by a pairwise independent
generator, and the seeds used for these different sequences are generated by a random walk on
an expander graph (cf. [1] and [11, Sec. 3.6.3]).

^ The relation between hitters and dispersers is analogous.

The theorem is proved by reducing the construction of extractors to the construction of 
certain pseudorandom generators (considered in the next subsection). The reduction is 
further discussed at the end of the next subsection. 



6.2 Alternative Notions of Generator Efficiency 

The above discussion has focused on one aspect of the pseudorandomness question - 
the resources or type of the observer (or potential distinguisher). Another important 
question is at what cost can pseudorandom sequences be generated (from much shorter 
seeds, assuming this is at all possible). Throughout this survey we have required the 
generation process to be at least as efficient as the efficiency limitations of the dis- 
tinguisher.* This seems indeed “fair” and natural. Allowing the generator to be more 
complex (i.e., use more time or space resources) than the distinguisher seems unfair 
(and is typically unreasonable in the context of cryptography), but still yields interesting 
consequences in the context of “de-randomization” (i.e., transforming randomized al- 
gorithms into equivalent deterministic algorithms (of slightly higher complexity)). For 
example, one may consider generators working in time exponential in the length of the 
seed. As observed by Nisan and Wigderson [28], in some cases we lose nothing by 
being more liberal (i.e., allowing exponential-time generators). To see why, we consider 
a typical de-randomization argument, proceeding in two steps: First one replaces the 
true randomness of the algorithm by pseudorandom sequences generated from much 
shorter seeds, and next one goes deterministically over all possible seeds and looks for 
the most frequent behavior of the modified algorithm. Thus, in such a case the deter- 
ministic complexity is anyhow exponential in the seed length. The benefit of allowing 
exponential-time generators is that constructing exponential-time generators may be 
easier than constructing polynomial-time ones. A typical result in this vein follows. 
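
The two-step de-randomization argument can be sketched generically as follows. All names are hypothetical, and the `repeat_seed` "generator" is a deliberately trivial placeholder that is not pseudorandom in any sense; the sketch only shows the control flow of enumerating all seeds and taking the most frequent answer.

```python
from collections import Counter
from itertools import product


def derandomize(algorithm, generator, seed_len, x):
    """Run `algorithm` on x with every generator output and take a majority.

    algorithm(x, bits) is a randomized algorithm that reads its coins from
    the tuple `bits`; generator(seed) stretches a seed into such a tuple.
    The loop visits all 2**seed_len seeds, so the running time is
    exponential in the seed length regardless of the generator's own cost.
    """
    votes = Counter(
        algorithm(x, generator(seed))
        for seed in product((0, 1), repeat=seed_len)
    )
    return votes.most_common(1)[0][0]


def repeat_seed(seed):
    """Placeholder 'generator' (NOT pseudorandom): repeat the seed 4 times."""
    return seed * 4


def noisy_is_even(x, bits):
    """Toy randomized algorithm: errs exactly when its first two coins are 1."""
    correct = (x % 2 == 0)
    return (not correct) if bits[0] == 1 and bits[1] == 1 else correct
```

In the toy example with `seed_len = 3`, two of the eight seeds make `noisy_is_even` err, so the majority vote returns the correct answer. The cost of the enumeration is what makes an exponential-time generator affordable in this setting.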

Theorem 18 (De-randomization of BPP [19] (building upon [28])): Suppose that there
exists a language $L \in \mathcal{E}$ having almost-everywhere exponential circuit complexity.^
Then, $\mathcal{BPP} = \mathcal{P}$.

Proof Sketch: Underlying the proof is a construction of a pseudorandom generator due 
to Nisan and Wigderson [25, 28]. This construction utilizes a predicate computable in 
exponential-time but unpredictable, even to within a particular exponential advantage, 
by any circuit family of a particular exponential size. (The crux of [19] is in supplying 

* In fact, we have required the generator to be more efficient than the distinguisher: The former
was required to be a fixed polynomial-time algorithm, whereas the latter was allowed to be any 
algorithm with polynomial running time. 

^ We say that $L$ is in $\mathcal{E}$ if there exists an exponential time algorithm for deciding $L$; that is,
the running-time of the algorithm on input $x$ is at most $2^{O(|x|)}$. By saying that $L$ has
almost-everywhere exponential circuit complexity we mean that there exists a constant $b > 0$ such
that, for all but finitely many $k$'s, any circuit $C_k$ that correctly decides $L$ on $\{0,1\}^k$ has size at
least $2^{bk}$.

such a predicate, given the hypothesis.) Given such a predicate the generator works by
evaluating the predicate on exponentially-many subsequences of the bits of the seed so
that the intersection of any two subsets is relatively small. Thus, for some constant
$b > 0$ and all $k$'s, the generator stretches seeds of length $k$ into sequences of length $2^{bk}$
that (as loosely argued below) cannot be distinguished from truly random sequences by
any circuit of size $2^{bk}$. The de-randomization of $\mathcal{BPP}$ proceeds by setting the seed-length
to be logarithmic in the input length, and utilizing the above generator.

The above generator fools circuits of the stated size, even when these circuits are 
presented with the seed as auxiliary input. (These circuits are smaller than the run- 
ning time of the generator and so they cannot just evaluate the generator on the given 
seed.) The proof that the generator fools such circuits refers to the characterization of 
pseudorandom sequences as unpredictable ones. Thus, one proves that the next bit in 
the generator’s output cannot be predicted given all previous bits (as well as the seed). 
Assuming that a small circuit can predict the next bit of the generator, we construct 
a circuit for predicting the hard predicate. The new circuit incorporates the best (for 
such prediction) augmentation of the input to the circuit into a seed for the generator 
(i.e., the bits not in the specific subset of the seed are fixed in the best way). The key 
observation is that all other bits in the output of the generator depend only on a small 
fraction of the input bits (i.e., recall the small intersection clause above), and so circuits 
for computing these other bits have relatively small size (and so can be incorporated in 
the new circuit). Using all these circuits, the new circuit forms the adequate input for 
the next-bit predicting circuit, and outputs whatever the latter circuit does. □

Connection to extractors. Trevisan’s construction [36] adapts the computational frame- 
work underlying the Nisan-Wigderson Generator [28] to the information-theoretic con- 
text of extractors. His adaptation is based on two key observations. The first observation 
is that the generator itself uses a (supposedly hard) predicate as a black-box. Trevisan’s 
construction utilizes a “random” predicate which is encoded by the first input to the
extractor. For example, the $n$-bit input may encode a predicate on $\log_2 n$ bits in the obvious
manner. The second input to the extractor, having length $t = O(\log n)$, will be used as
the seed to the resulting generator (defined by using this random predicate in a
black-box manner). The second key observation is that the proof of indistinguishability of the
generator provides a black-box procedure for computing the underlying predicate when
given oracle access to a distinguisher. Thus, any subset $S \subseteq \{0,1\}^m$ of the possible
outputs of the extractor gives rise to a relatively small set $P_S$ of predicates, so that for
each value $x \in \{0,1\}^n$ of the first input to the extractor, if $S$ “distinguishes” the output
of the extractor (on a random second input) from the uniform distribution then one of
the predicates in $P_S$ equals the predicate associated with $x$. It follows that for every set
S, the set of possible first inputs for which the probability that the extractor hits S does 
not approximate the density of S is small. This establishes the extraction property. 



These subsets have size linear in the length of the seed, and intersect on a constant fraction of
their respective size. Furthermore, they can be determined within exponential-time.



Abstract. We study contention-resolution protocols for multiple-access
channels. We show that every backoff protocol is transient if the arrival
rate, $\lambda$, is at least 0.42 and that the capacity of every backoff protocol
is at most 0.42. Thus, we show that backoff protocols have (provably)
smaller capacity than full-sensing protocols. Finally, we show that the
corresponding results, with the larger arrival bound of 0.531, also hold
for every acknowledgement-based protocol.



1 Introduction 

A multiple-access channel is a broadcast channel that allows multiple users to 
communicate with each other by sending messages onto the channel. If two or 
more users simultaneously send messages, then the messages interfere with each 
other (collide), and the messages are not transmitted successfully. The channel is 
not centrally controlled. Instead, the users use a contention-resolution protocol 
to resolve collisions. Thus, after a collision, each user involved in the collision 
waits a random amount of time (which is determined by the protocol) before 
re-sending. 

Following previous work on multiple-access channels, we work in a time- 
slotted model in which time is partitioned into discrete time steps. At the be- 
ginning of each time step, a random number of messages enter the system, each 
of which is associated with a new user which has no other messages to send. 
The number of messages that enter the system is drawn from a Poisson
distribution with mean $\lambda$. During each time step, each message chooses independently
whether to send to the channel. If exactly one message sends to the channel 
during the time step, then this message leaves the system and we call this a 
success. Otherwise, all of the messages remain in the system and the next time 

*A full version appears at http://www.dcs.warwick.ac.uk/~leslie/papers/gjkp.ps.

step is started. Note that when a message sends to the channel this may or may
not result in a success, depending on whether or not any other messages send to 
the channel. 

The quality of a protocol can be measured in several ways. Typically, one 
models the execution of the protocol as a Markov chain. If the protocol is good 
(for a given arrival rate $\lambda$), the corresponding Markov chain will be recurrent
(with probability 1, it will eventually return to the empty state in which no 
messages are waiting). Otherwise, the chain is said to be transient (and we also 
say that a protocol is transient). Note that transience is a very strong form of 
instability. In particular, if we focus on any finite set of “good” states then if 
the chain is transient, the probability of visiting these states at least N times 
during the infinite run of the protocol is exponentially small in N. (This follows 
because the relevant Markov chain is irreducible and aperiodic.) 

Another way to measure the quality of a protocol is to measure its capacity. 
A protocol is said to achieve throughput $\lambda$ if, when it is run with input rate $\lambda$,
the average success rate is $\lambda$. The capacity of the protocol is the maximum
throughput that it achieves. 

The protocols that we consider in this paper are acknowledgement-based pro- 
tocols. In the acknowledgement-based model, the only information that a user 
receives about the state of the system is the history of its own transmissions. An 
alternative model is the full-sensing model, in which every user listens to the 
channel at every step, regardless of whether it sends during the step.^

One particularly simple and easy-to-implement class of acknowledgement-based
protocols is the class of backoff protocols. A backoff protocol is a sequence
of probabilities $p_0, p_1, \ldots$. If a message has sent unsuccessfully $i$ times before a
time-step, then with probability $p_i$, it sends during the time-step. Otherwise, it
does not send. Kelly and MacPhee gave a formula for the critical
arrival rate, $\lambda^*$, of a backoff protocol, which is the minimum arrival rate for
which the expected number of successful transmissions that the protocol makes
is finite.^
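
The time-slotted model and the definition of a backoff protocol can be made concrete with a small simulation. This is an illustrative sketch only: the function names are hypothetical, Poisson arrivals are sampled with Knuth's multiplication method, and the protocol is passed as a function `p` mapping the number $i$ of unsuccessful sends to the probability $p_i$.

```python
import math
import random


def poisson(lam, rng):
    """Sample a Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_backoff(p, lam, steps, seed=0):
    """Simulate a backoff protocol; return (arrivals, successes, backlog).

    waiting[j] is the number of unsuccessful sends made so far by the j-th
    message still in the system. In each step, new messages arrive, every
    message sends independently with probability p(i), and the step is a
    success exactly when one message sends.
    """
    rng = random.Random(seed)
    waiting = []
    arrivals = successes = 0
    for _ in range(steps):
        new = poisson(lam, rng)
        arrivals += new
        waiting.extend([0] * new)
        senders = [j for j, i in enumerate(waiting) if rng.random() < p(i)]
        if len(senders) == 1:
            waiting.pop(senders[0])  # the lone sender leaves the system
            successes += 1
        else:
            for j in senders:  # collision (or silence): senders back off
                waiting[j] += 1
    return arrivals, successes, len(waiting)
```

For instance, `simulate_backoff(lambda i: 2.0 ** -i, 0.3, 10_000)` runs the protocol with send probabilities $2^{-i}$ at arrival rate 0.3. A finite simulation can only suggest how the backlog evolves; it does not establish recurrence or transience.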

Perhaps the best-known backoff protocol is the binary exponential backoff
protocol in which $p_i = 2^{-i}$. This protocol is the basis of the Ethernet protocol of
Metcalfe and Boggs.^ Kelly and MacPhee showed that the critical arrival rate
of this protocol is $\ln 2$. Thus, if $\lambda > \ln 2$, then binary exponential backoff achieves
only a finite number of successful transmissions (in expectation). Aldous
showed that the binary exponential backoff protocol is not a good protocol for

^ In practice, it is possible to implement the full-sensing model when there is a single 
channel, but this becomes increasingly difficult in situations where there are multiple 
shared channels, such as optical networks. Thus, acknowledgement-based protocols 
are sometimes preferable to full-sensing protocols. For work on contention-resolution 
in the multiple-channel setting, see |H1. 

^ If $\lambda > \lambda^*$, then the expected number of successes is finite, even if the protocol runs
forever. They showed that the critical arrival rate is 0 if the expected number of
times that a message sends during the first $t$ steps is $\omega(\log t)$.

^ There are several differences between the “real-life” Ethernet protocol and “pure”
binary exponential backoff, but we do not describe these here.

any positive arrival rate $\lambda$. In particular, it is transient and the expected number
of successful transmissions in $t$ steps is $o(t)$. MacPhee posed the question
of whether there exists a backoff protocol which is recurrent for some positive
arrival rate $\lambda$.

In this paper, we show that there is no backoff protocol which is recurrent
for $\lambda \geq 0.42$. (Thus, every backoff protocol is transient if $\lambda \geq 0.42$.) Also, every
backoff protocol has capacity at most 0.42. As far as we know, our result is the
first proof showing that backoff protocols have smaller capacity than full-sensing
protocols. In particular, Mosely and Humblet have discovered a full-sensing
protocol with capacity 0.48776.^ Finally, we show that no acknowledgement-based
protocol is recurrent for $\lambda > 0.530045$.

1.1 Related Work 

Backoff protocols and acknowledgement-based protocols have also been studied 
in an n-user model, which combines contention-resolution with queueing. In this 
model, it is assumed that n users maintain queues of messages, and that new 
messages arrive at the tails of the queues. At each step, the users use contention- 
resolution protocols to try to send the messages at the heads of their queues. 
It turns out that the queues have a stabilising effect, so some protocols (such 
as “polynomial backoff” ) which are unstable in our model US! are stable in the 
queueing model HH. We will not describe queueing-model results here, but the 
reader is referred to 121 , m, HH, M- 

Much work has gone into determining upper bounds on the capacity that can 
be achieved by a full-sensing protocol. The current best result is due to Tsybakov 
and Likhanov m who have shown that no protocol can achieve capacity higher 
than 0.568. (For more information, see 0, E2-) 

1.2 Improvements 

We choose $\lambda = 0.42$ in order to make the proof of Lemma0 as simple as possible.
The lemma seems to be true for $\lambda$ down to about 0.41 and presumably the
parameters $A$ and $B$ could be tweaked to get $\lambda$ slightly smaller.

2 Markov Chain Background 

An irreducible aperiodic Markov chain $X = \{X_0, X_1, \ldots\}$ with a countable state
space $\Omega$ is recurrent if it returns to its start state with probability 1.
That is, it is recurrent if for some state $i$ (and therefore, for all $i$), $\mathrm{Prob}[X_t =
i \text{ for some } t \geq 1 \mid X_0 = i] = 1$. Otherwise, $X$ is said to be transient. $X$ is
positive recurrent (or ergodic) if the expected number of steps that it takes

^ Mosely and Humblet’s protocol is a “tree protocol” in the sense of Capetanakis 0 
and Tsybakov and Mikhailov m- For a simple analysis of the protocol, see 123 - 
Vvedenskaya and Pinsker have shown how to modify Mosely and Humblet’s protocol 
to achieve an improvement in the capacity (in the seventh decimal place).

before returning to its start state is finite. We use the following theorems which
we take from |^. 

Theorem 1. (Fayolle, Malyshev, Menshikov) A time-homogeneous irreducible
aperiodic Markov chain $X$ with countable state space $\Omega$ is not positive recurrent
if there is a function $f$ with domain $\Omega$ and there are constants $C$, $d$ such that

1. there is a state $x$ with $f(x) > C$, and a state $x$ with $f(x) \leq C$, and

2. $E[f(X_1) - f(X_0) \mid X_0 = x] \geq 0$ for all $x$ with $f(x) > C$, and

3. $E[\,|f(X_1) - f(X_0)| \mid X_0 = x] \leq d$ for every state $x$.

Theorem 2. (Fayolle, Malyshev, Menshikov) A time-homogeneous irreducible 
aperiodic Markov chain X with countable state space FI is transient if there are 
a positive function f with domain FI and positive constants C , d, e such that 

1. there is a state x with f{x) > C , and a state x with f{x) < C , and 

2. E[f{Xi) — /(ATg) I Xq = x]> £ for all x with f{x) > C, and 

3. if \f{x) — f{y)\ > d then the probability of moving from x to y in a single 
move is 0. 

3 Stochastic Domination and Monotonicity 

Suppose that $X$ is a Markov chain and that the (countable) state space $\Omega$ of the
chain is a partial order with binary relation $\le$. If $A$ and $B$ are random variables
taking states as values, then $B$ dominates $A$ if and only if there is a joint sample
space for $A$ and $B$ in which the value of $A$ is always less than or equal to the value
of $B$. Note that there will generally be other joint sample spaces in which the
value of $A$ can exceed the value of $B$. Nevertheless, we write $A \le B$ to indicate
that $B$ dominates $A$. We say that $X$ is monotonic if for any states $x \le x'$, the
next state conditioned on starting at $x'$ dominates the next state conditioned on
starting at $x$. (Formally, $(X_1 \mid X_0 = x')$ dominates $(X_1 \mid X_0 = x)$.)

When an acknowledgement-based protocol is viewed as a Markov chain, the 
state is just the collection of messages in the system. (Each message is identified 
by the history of its transmissions.) Thus, the state space is countable and it 
forms a partial order with respect to the subset inclusion relation $\subseteq$ (for
multisets). We say that a protocol is deletion resilient (Z| if its Markov chain is
monotonic with respect to the subset-inclusion partial order. 

Observation 3. Every acknowledgement-based protocol is deletion resilient. 

As we indicated earlier, we will generally assume that the number of messages
entering the system at a given step is drawn from a Poisson process with
mean $\lambda$. However, it will sometimes be useful to consider other message-arrival
distributions. If $I$ and $I'$ are message-arrival distributions, we write $I \le I'$ to
indicate that the number of messages generated under $I$ is dominated by the
number of messages generated under $I'$.

Observation 4. If the acknowledgement-based protocol $P$ is recurrent under
message-arrival distribution $I'$ and $I \le I'$ then $P$ is also recurrent under $I$.
4 Backoff Protocols

In this section, we will show that there is no backoff protocol which is recurrent
for $\lambda \ge 0.42$. Our method will be to use the “drift theorems” in Section 2.
Let $p_0, p_1, \ldots$ be a backoff protocol. Without loss of generality, we can assume
$p_0 = 1$, since we can ignore new arrivals until they first send. Let $\lambda = 0.42$. Let
$X$ be the Markov chain described in Section 3 which describes the behaviour
of the protocol with arrival rate $\lambda$. First, we will construct a potential function
(Lyapounov function) $f$ which satisfies the conditions of Theorem 1, that is, a
potential function which has a bounded positive drift. We will use Theorem 1
to conclude that the chain is not positive recurrent. Next, we will consider the
behaviour of the protocol under a truncated arrival distribution and we will use
Theorem 2 to show that the protocol is transient. Using Observation 4 (domination),
we will conclude that the protocol is also transient with Poisson arrivals at
rate $\lambda$ or higher. Finally, we will show that the capacity of every backoff protocol
is at most 0.42.

We now define some parameters of a state $x$. Let $k(x)$ denote the number
of messages in state $x$. If $k(x) = 0$, then $p(x) = r(x) = u(x) = 0$. Otherwise,
let $m_1, \ldots, m_{k(x)}$ denote the messages in state $x$, with send probabilities
$p_1 \ge \cdots \ge p_{k(x)}$. Let $p(x) = p_1$ and let $r(x)$ denote the probability that at least one
of $m_2, \ldots, m_{k(x)}$ sends on the next step. Let $u(x)$ denote the probability that
exactly one of $m_2, \ldots, m_{k(x)}$ sends on the next step. Clearly $u(x) \le r(x)$. If
$p(x) < r(x)$ then we use the following (tighter) upper bound for $u(x)$, which we
prove in the full version.

Lemma 1. If $p(x) < r(x)$ then $u(x) \le$ ■

Let $S(x)$ denote the probability that there is a success when the system is
run for one step starting in state $x$. (Recall that a success occurs if exactly one
message sends during the step. This single sender might be a new arrival, or it
might be an old message from state $x$.) Let

$$g(r,p) = e^{-\lambda}\bigl[(1 - r)p + (1 - p)\min\{r,\ \blacksquare\} + (1 - p)(1 - r)\lambda\bigr].$$

We now have the following corollary of Lemma 1.

Corollary 1. For any state $x$, $S(x) \le g(r(x), p(x))$.
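The quantities $p(x)$, $r(x)$, $u(x)$ and $S(x)$ can be computed directly for a small state, which may help when reading the bounds above. The following is a minimal sketch, assuming Poisson arrivals with mean `lam` and the convention $p_0 = 1$ (each new arrival sends at once); the decomposition of $S(x)$ into three disjoint success events is our own reading of the definitions, and the function names are hypothetical. A brute-force enumeration over all send patterns is included as a cross-check.

```python
import math
from itertools import product


def state_parameters(send_probs):
    """Return (p, r, u) for a state whose messages have the given send probabilities."""
    probs = sorted(send_probs, reverse=True)
    if not probs:
        return 0.0, 0.0, 0.0
    p, rest = probs[0], probs[1:]
    # r: probability that at least one of m_2, ..., m_k sends.
    r = 1.0 - math.prod(1.0 - q for q in rest)
    # u: probability that exactly one of m_2, ..., m_k sends.
    u = sum(
        q * math.prod(1.0 - q2 for j, q2 in enumerate(rest) if j != i)
        for i, q in enumerate(rest)
    )
    return p, r, u


def success_probability(send_probs, lam):
    """S(x), assuming Poisson(lam) arrivals that all send immediately."""
    p, r, u = state_parameters(send_probs)
    # Success: only m_1 sends, or only one other old message sends (no arrivals),
    # or no old message sends and there is exactly one arrival.
    return math.exp(-lam) * (p * (1 - r) + (1 - p) * u + lam * (1 - p) * (1 - r))


def success_probability_bruteforce(send_probs, lam):
    """Same quantity, by enumerating every send pattern of the old messages."""
    total = 0.0
    for pattern in product((0, 1), repeat=len(send_probs)):
        weight = math.prod(q if b else 1.0 - q for q, b in zip(send_probs, pattern))
        if sum(pattern) == 1:      # one old sender, so no arrival is allowed
            total += weight * math.exp(-lam)
        elif sum(pattern) == 0:    # no old sender, so exactly one arrival is needed
            total += weight * lam * math.exp(-lam)
    return total
```

Since $u(x) \le r(x)$ always holds, replacing `u` by `r` in `success_probability` gives a valid but weaker upper bound on $S(x)$ than the one obtained from Lemma 1.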

Let $s(x)$ denote the probability that at least one message in state $x$ sends
on the next step. That is, $s(x)$ is the probability that at least one existing
message in $x$ sends. New arrivals may also send. There may or may not be
a success. (Thus, if $x$ is the empty state, then $s(x) = 0$.) Let $A = 0.9$ and
$B = 0.41$. For every $\pi \in [0, 1]$, let $c(\pi) = \max(0, -A\pi + B)$. For every state $x$,
let $f(x) = k(x) + c(s(x))$. The function $f$ is the potential function alluded to
earlier, which plays a leading role in Theorems 1 and 2. To a first approximation,
$f(x)$ counts the number of messages in the state $x$, but the small correction term
is crucial. Finally, let

$$h(r,p) = \lambda - g(r,p) - \bigl[1 - e^{-\lambda}(1 - p)(1 - r)(1 + \lambda)\bigr]\, c(r + p - rp) + e^{-\lambda} p (1 - r)\, c(r).$$
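To make the potential function concrete, the sketch below evaluates the correction term $c$ and the potential $f$ for a state given as a list of send probabilities, using the constants $A = 0.9$ and $B = 0.41$ from the text. Representing a state by its list of send probabilities, and the function names, are assumptions made only for illustration.

```python
import math

A = 0.9
B = 0.41


def correction(pi):
    """c(pi) = max(0, -A*pi + B): positive only while the send probability is small."""
    return max(0.0, -A * pi + B)


def some_message_sends(send_probs):
    """s(x): probability that at least one existing message sends on the next step."""
    return 1.0 - math.prod(1.0 - q for q in send_probs)


def potential(send_probs):
    """f(x) = k(x) + c(s(x)): message count plus the small correction term."""
    return len(send_probs) + correction(some_message_sends(send_probs))
```

The correction never exceeds $B$, so $f(x)$ differs from the message count $k(x)$ by at most 0.41; it vanishes as soon as $s(x) \ge B/A$.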
Now we have the following.

Observation 5. For any state $x$, $E[\,|f(X_1) - f(X_0)|\, \mid X_0 = x] \le 1 + B$.

Lemma 2. For any state $x$, $E[f(X_1) - f(X_0) \mid X_0 = x] \ge h(r(x), p(x))$.

Proof. The result follows from the following chain of inequalities, each link of
which is justified below.

$$\begin{aligned}
E[f(X_1) - f(X_0) \mid X_0 = x] &= \lambda - S(x) + E[c(s(X_1)) \mid X_0 = x] - c(s(x)) \\
&\ge \lambda - g(r(x), p(x)) + E[c(s(X_1)) \mid X_0 = x] - c(s(x)) \\
&\ge \lambda - g(r(x), p(x)) + e^{-\lambda}(1 - p(x))(1 - r(x))(1 + \lambda)\, c(s(x)) \\
&\qquad + e^{-\lambda} p(x)(1 - r(x))\, c(r(x)) - c(s(x)) \\
&= h(r(x), p(x)).
\end{aligned}$$

The first inequality follows from Corollary 1. The second comes from substituting
exact expressions for $c(s(X_1))$ whenever the form of $X_1$ allows it, and using the
bound $c(s(X_1)) \ge 0$ elsewhere. If none of the existing messages sends and there
is at most one arrival, then $c(s(X_1)) = c(s(x))$, giving the third term; if message
$m_1$ alone sends and there are no new arrivals then $c(s(X_1)) = c(r(x))$, giving the
fourth term. The final equality uses the fact that $s(x) = p(x) + r(x) - p(x)r(x)$.

□ 



Lemma 3. For any $r \in [0, 1]$ and $p \in [0, 1]$, $h(r,p) \ge 0.003$.

Proof. Figure 1 contains a (Mathematica-produced) plot of $-h(r,p)$ over the
range $r \in [0,1]$, $p \in [0,1]$. The plot suggests that $-h(r,p)$ is bounded below
zero. The proof of the lemma (which is in the full version of the paper) involves
evaluating certain polynomials at about 40 thousand points, and we did this
using Mathematica. □

We now have the following theorem. 

Theorem 6. No backoff protocol is positive recurrent when the arrival rate is
$\lambda = 0.42$.

Proof. This follows from Theorem 1, Observation 5 and Lemmas 2 and 3. The
value $C$ in Theorem 1 can be taken to be 1 and the value $d$ can be taken to be
$1 + B$. □

Fig. 1. $-h(r,p)$ over the range $r \in [0, 1]$, $p \in [0, 1]$.

Now we wish to show that every backoff protocol is transient for $\lambda \ge 0.42$.
Once again, fix a backoff protocol $p_0, p_1, \ldots$ with $p_0 = 1$. Notice that our potential
function $f$ almost satisfies the conditions in Theorem 2. The main problem is
that there is no absolute bound on the amount that $f$ can change in a single step,
because the arrivals are drawn from a Poisson distribution. We get around this
problem by first considering a truncated-Poisson distribution, $T_{M,\lambda}$, in which the
probability of $r$ inputs is $e^{-\lambda}\lambda^r / r!$ (as for the Poisson distribution) when $r < M$,
but $r = M$ for the remaining probability. By choosing $M$ sufficiently large we
can have $E[T_{M,\lambda}]$ arbitrarily close to $\lambda$. Using methods similar to those used in
the proof of Theorem 6 (but using Theorem 2 instead of Theorem 1) we obtain
Lemma 4 which in turn (by Observation 4) implies Theorem 7. Lemmas 2 and 3
can also be used (together with deletion resilience (Observation 3)) to show that
the capacity of every backoff protocol is at most 0.42, so we obtain Theorem 8.
For details, see the full version.
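The truncated-Poisson distribution is easy to tabulate, and doing so shows how quickly its mean approaches the Poisson mean as the cut-off $M$ grows. The sketch below is only an illustration of the definition in the preceding paragraph; the function names are hypothetical.

```python
import math


def truncated_poisson_pmf(M, lam):
    """pmf of T_{M,lam}: Poisson weights for r < M, all remaining mass on r = M."""
    pmf = [math.exp(-lam) * lam**r / math.factorial(r) for r in range(M)]
    pmf.append(1.0 - sum(pmf))
    return pmf


def mean(pmf):
    """Expected value of a distribution on 0, 1, ..., len(pmf) - 1."""
    return sum(r * q for r, q in enumerate(pmf))
```

Because $T_{M,\lambda}$ has the law of $\min(N, M)$ for a Poisson variable $N$ with mean $\lambda$, it is dominated by the Poisson distribution and its mean is at most $\lambda$. For $\lambda = 0.42$ a small cut-off such as $M = 6$ already brings the mean within 0.001 of $\lambda$, which is the regime required in Lemma 4.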

Lemma 4. Every backoff protocol is transient for the input distribution $T_{M,\lambda}$
when $\lambda = 0.42$ and $\lambda' = E[T_{M,\lambda}] > \lambda - 0.001$.



Theorem 7. Every backoff protocol is transient under the Poisson distribution
with arrival rate $\lambda \ge 0.42$.



Theorem 8. The capacity of every backoff protocol is at most 0.42. 

5 Acknowledgement-Based Protocols 

We will prove that every acknowledgement-based protocol is transient for all
$\lambda > 0.531$; see Theorem 9 for a precise statement of this claim.

An acknowledgement-based protocol can be viewed as a system which, at every
step $t$, decides what subset of the old messages to send. The decision is a
probabilistic one dependent on the histories of the messages held. As a technical 
device for proving our bounds, we introduce the notion of a “genie”, which (in 
general) has more freedom in making these decisions than a protocol. 

Since we only consider acknowledgement-based protocols, the behaviour of 
each new message is independent of the other messages and of the state of the 
system until after its first send. This is why we ignore new messages until their 
first send - for Poisson arrivals this is equivalent to the convention that each 
message sends at its arrival time. As a consequence, we impose the limitation on 
a genie, that each decision is independent of the number of arrivals at that step. 

A genie is a random variable over the natural numbers, dependent on the 
complete history (of arrivals and sends of messages) up to time t—1, which gives 
a natural number representing the number of (old) messages to send at time t. It 
is clear that for every acknowledgement-based protocol there is a corresponding 
genie. However there are genies which do not behave like any protocol, e.g., a 
genie may give a cumulative total number of “sends” up to time t which exceeds 
the actual number of arrivals up to that time. 

We prove a preliminary result for such “unconstrained” genies, but then we 
impose some constraints reflecting properties of a given protocol in order to 
prove our final results. 

Let $I(t)$, $G(t)$ be the number of arrivals and the genie's send value, respectively,
at step $t$. It is convenient to introduce some indicator variables to express
various outcomes at the step under consideration. We use $i_0, i_1$ for the events of
no new arrival, or exactly one arrival, respectively, and $g_0, g_1$ for the events of no
send and exactly one send from the genie. The indicator random variable $S(t)$
for a success at time $t$ is given by $S(t) = i_0 g_1 + i_1 g_0$. Let $\mathrm{In}(t) = \sum_{j \le t} I(j)$ and
$\mathrm{Out}(t) = \sum_{j \le t} S(j)$. Define $\mathrm{Backlog}(t) = \mathrm{In}(t) - \mathrm{Out}(t)$. Let $\lambda = \lambda_0 \approx 0.567$ be
the (unique) root of $\lambda = e^{-\lambda}$.

Lemma 5. For any genie and input rate $\lambda > \lambda_0$, there exists $\varepsilon > 0$ such that

$\mathrm{Prob}[\mathrm{Backlog}(t) > \varepsilon t \text{ for all } t \ge T] \to 1$ as $T \to \infty$.

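Proof. Let $3\varepsilon = \lambda - e^{-\lambda} > 0$. At any step $t$, $S(t)$ is a Bernoulli variable with
expectation $0$, $e^{-\lambda}$, $\lambda e^{-\lambda}$, according as $G(t) > 1$, $G(t) = 1$, $G(t) = 0$, respectively,
which is dominated by the Bernoulli variable with expectation $e^{-\lambda}$. Therefore
$E[\mathrm{Out}(t)] \le e^{-\lambda}t$, and also $\mathrm{Prob}[\mathrm{Out}(t) - e^{-\lambda}t < \varepsilon t \text{ for all } t \ge T] \to 1$ as $T \to \infty$.
(To see this note that, by a Chernoff bound, $\mathrm{Prob}[\mathrm{Out}(t) - e^{-\lambda}t \ge \varepsilon t] \le e^{-\delta t}$
for a positive constant $\delta$. Thus,

$$\mathrm{Prob}[\exists t \ge T \text{ such that } \mathrm{Out}(t) - e^{-\lambda}t \ge \varepsilon t] \le \sum_{t \ge T} e^{-\delta t},$$

which goes to 0 as $T$ goes to $\infty$.) We also have $E[\mathrm{In}(t)] = \lambda t$ and
$\mathrm{Prob}[\lambda t - \mathrm{In}(t) \le \varepsilon t \text{ for all } t \ge T] \to 1$ as $T \to \infty$, since $\mathrm{In}(t) = \mathrm{Poisson}(\lambda t)$. Since

$$\begin{aligned}
\mathrm{Backlog}(t) = \mathrm{In}(t) - \mathrm{Out}(t) &= (\lambda - e^{-\lambda})t + (\mathrm{In}(t) - \lambda t) + (e^{-\lambda}t - \mathrm{Out}(t)) \\
&= \varepsilon t + (\varepsilon t + \mathrm{In}(t) - \lambda t) + (\varepsilon t + e^{-\lambda}t - \mathrm{Out}(t)),
\end{aligned}$$

the result follows.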



□ 



Corollary 2. No acknowledgement-based protocol is recurrent for $\lambda > \lambda_0$ or has
capacity greater than $\lambda_0$.

To strengthen the above result we introduce a restricted class of genies. We 
think of the messages which have failed exactly once as being contained in the 
bucket. (More generally, we could consider an array of buckets, where the jth 
bucket contains those messages which have failed exactly j times.) A 1-bucket 
genie, here called simply a bucket genie, is a genie which simulates a given 
protocol for the messages in the bucket and is required to choose a send value 
which is at least as great as the number of sends from the bucket. For such 
constrained genies, we can improve the bound of Corollary 2.

For the range of arrival rates we consider, an excellent strategy for a genie is to 
ensure that at least one message is sent at each step. Of course a bucket genie has 
to respect the bucket messages and is obliged sometimes to send more than one 
message (inevitably failing) . An eager genie always sends at least one message, 
but otherwise sends the minimum number consistent with its constraints. 

An eager bucket genie is easy to analyse, since every arrival is blocked by
the genie and enters the bucket. For any acknowledgement-based protocol, let
Eager denote the corresponding eager bucket genie. Let $\lambda = \lambda_1 \approx 0.531$ be the
(unique) root of $\lambda = (1 + \lambda)e^{-2\lambda}$. The following lemma is proved in the full
version.

Lemma 6. For any eager bucket genie and input rate $\lambda > \lambda_1$, there exists $\varepsilon > 0$
such that

$\mathrm{Prob}[\mathrm{Backlog}(t) > \varepsilon t \text{ for all } t \ge T] \to 1$ as $T \to \infty$.

Let Any be an arbitrary bucket genie and let Eager be the eager bucket genie 
based on the same bucket parameters. We may couple the executions of Eager 
and Any so that the same arrival sequences are presented to each. It will be 
clear that at any stage the set of messages in Any’s bucket is a subset of those 
in Eager’s bucket. We may further couple the behaviour of the common subset 
of messages. Let $\lambda = \lambda_2 \approx 0.659$ be the (unique) root of $\lambda = 1 - \lambda e^{-\lambda}$.
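The three thresholds used in this section are roots of simple fixed-point equations and can be checked numerically. The following sketch uses plain bisection on $[0, 1]$; it assumes the defining equations are $\lambda = e^{-\lambda}$, $\lambda = (1 + \lambda)e^{-2\lambda}$ and $\lambda = 1 - \lambda e^{-\lambda}$ (the exponent in the second equation is our reading of a damaged formula), and the helper name `bisect_root` is hypothetical.

```python
import math


def bisect_root(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi] by bisection, assuming g changes sign exactly once."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2


# Each function is negative at 0 and positive at 1.
lambda_0 = bisect_root(lambda x: x - math.exp(-x), 0.0, 1.0)
lambda_1 = bisect_root(lambda x: x - (1 + x) * math.exp(-2 * x), 0.0, 1.0)
lambda_2 = bisect_root(lambda x: x - (1 - x * math.exp(-x)), 0.0, 1.0)
```

The computed roots agree with the quoted values 0.567, 0.531 and 0.659 to within rounding, and satisfy $\lambda_1 < \lambda_0 < \lambda_2$, which is the ordering the proof of Theorem 9 relies on.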

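Lemma 7. For the coupled genies Any and Eager defined above, if $\mathrm{Out}_A$ and
$\mathrm{Out}_E$ are the corresponding output functions, we define $\Delta\mathrm{Out}(t) = \mathrm{Out}_E(t) - \mathrm{Out}_A(t)$.
For any $\lambda < \lambda_2$ and any $\varepsilon > 0$,

$\mathrm{Prob}[\Delta\mathrm{Out}(t) \ge -\varepsilon t \text{ for all } t \ge T] \to 1$ as $T \to \infty$.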

Proof. Let $c_0, c_1, c_*$ be indicators for the events of the number of common messages
sending being 0, 1, or more than one, respectively. In addition, for the messages
which are only in Eager's bucket, we use the similar indicators $e_0, e_1, e_*$. Let
$a_0, a_1$ represent Any not sending, or sending, additional messages respectively.
(Note that Eager's behaviour is fully determined.)

We write $Z(t)$ for $\Delta\mathrm{Out}(t) - \Delta\mathrm{Out}(t - 1)$, for $t > 0$, so $Z$ represents the difference
in success between Eager and Any in one step. In terms of the indicators
we have

$$Z(t) = S_E(t) - S_A(t) = i_0 g_{E1}(t) + i_1 g_{E0}(t) - i_0 g_{A1}(t) - i_1 g_{A0}(t),$$

where $S_E(t)$ is the indicator random variable for a success of Eager at time $t$
and $g_{E1}(t)$ is the event that Eager sends exactly one message during step $t$ (and
so on) as in the paragraph before Lemma 5. Thus,

$$Z(t) \ge i_0 c_0 \bigl(a_0(e_0 + e_1) - a_1 e_*\bigr) - i_0 c_1 (e_1 + e_*) - i_1 c_0 a_0.$$

Note that if the number of arrivals plus the number of common bucket sends
is more than 1 then neither genie can succeed. We also need to keep track of the
number, $\Delta B$, of extra messages in Eager's bucket. At any step, at most one new
such extra message can arrive; the indicator for this event is $i_1 c_0 a_0$, i.e., there is
a single arrival and no sends from the common bucket, so if Any does not send
then this message succeeds but Eager's send will cause a failure. The number of
“extra” messages leaving Eager's bucket at any step is unbounded, given by a
random variable we could show as $e = 1 \cdot e_1 + 2 \cdot e_2 + \cdots$. However $e$ dominates
$e_1 + e_*$, and it is sufficient to use the latter. The change at one step in the number
of extra messages satisfies:

$$\Delta B(t) - \Delta B(t - 1) = i_1 c_0 a_0 - e \le i_1 c_0 a_0 - (e_1 + e_*).$$

Next we define $Y(t) = Z(t) - \alpha(\Delta B(t) - \Delta B(t - 1))$, for some positive constant
$\alpha$ to be chosen below. Note that $X(t) = \sum_{j=1}^{t} Y(j) = \Delta\mathrm{Out}(t) - \alpha\Delta B(t)$. We
also define

$$Y'(t) = i_0 c_0 \bigl(a_0(e_0 + e_1) - a_1 e_*\bigr) - i_0 c_1 (e_1 + e_*) - i_1 c_0 a_0 - \alpha\bigl(i_1 c_0 a_0 - (e_1 + e_*)\bigr)$$

and $X'(t) = \sum_{j=1}^{t} Y'(j)$. Note that $Y(t) \ge Y'(t)$.

We can identify five (exhaustive) cases A, B, C, D, E depending on the values
of the $c$'s, $a$'s and $e$'s, such that in each case $Y'(t)$ dominates a given random
variable depending only on $I(t)$.

A. $c_*$: $Y'(t) \ge 0$;

B. $(c_1 + c_0 a_1)(e_1 + e_*)$: $Y'(t) \ge \alpha - i_0$;

C. $(c_1 + c_0 a_1) e_0$: $Y'(t) \ge 0$;

D. $c_0 a_0 (e_0 + e_1)$: $Y'(t) \ge i_0 - (1 + \alpha) i_1$;

E. $c_0 a_0 e_*$: $Y'(t) \ge \alpha - (1 + \alpha) i_1$.

For example, the correct interpretation of Case B is “conditioned on
$(c_1 + c_0 a_1)(e_1 + e_*) = 1$, the value of $Y'(t)$ is at least $\alpha - i_0$.” Since $E[i_0] = e^{-\lambda}$ and
$E[i_1] = \lambda e^{-\lambda}$, we have $E[Y'(t)] \ge 0$ in each case, as long as
$\max\{e^{-\lambda}, \lambda e^{-\lambda}/(1 - \lambda e^{-\lambda})\} \le \alpha \le 1/\lambda - 1$.
There exists such an $\alpha$ for any $\lambda < \lambda_2$; for such $\lambda$ we
may take the value $\alpha = e^{-\lambda}$, say.
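The admissible range for the constant $\alpha$ can be checked numerically. The sketch below assumes the case bounds are read as: B gives $\alpha - i_0$, D gives $i_0 - (1 + \alpha) i_1$, E gives $\alpha - (1 + \alpha) i_1$, and A and C give 0, with $E[i_0] = e^{-\lambda}$ and $E[i_1] = \lambda e^{-\lambda}$. The function names are hypothetical.

```python
import math


def alpha_window(lam):
    """Lower and upper ends of the admissible interval for alpha at arrival rate lam."""
    e = math.exp(-lam)
    lower = max(e, lam * e / (1 - lam * e))
    upper = 1 / lam - 1
    return lower, upper


def case_expectations(lam, alpha):
    """Expected value of the lower bound on Y'(t) in each of the five cases."""
    e_i0 = math.exp(-lam)        # probability of no arrival
    e_i1 = lam * math.exp(-lam)  # probability of exactly one arrival
    return {
        "A": 0.0,
        "B": alpha - e_i0,
        "C": 0.0,
        "D": e_i0 - (1 + alpha) * e_i1,
        "E": alpha - (1 + alpha) * e_i1,
    }
```

With $\alpha = e^{-\lambda}$ the bound in case B is exactly zero, and cases D and E both reduce to the condition $\lambda \le 1 - \lambda e^{-\lambda}$. This is why the threshold $\lambda_2$ appears: above it the interval for $\alpha$ is empty.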
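Let $\mathcal{F}_t$ be the $\sigma$-field generated by the first $t$ steps of the coupled process.
Let $\hat{Y}(t) = Y'(t) - E[Y'(t) \mid \mathcal{F}_{t-1}]$ and let $\hat{X}(t) = \sum_{j=1}^{t} \hat{Y}(j)$. The sequence
$\hat{X}(0), \hat{X}(1), \ldots$ forms a martingale (see Definition 4.11 of ^(|) since
$E[\hat{X}(t) \mid \mathcal{F}_{t-1}] = \hat{X}(t - 1)$. Furthermore, there is a positive constant $c$ such that
$|\hat{X}(t) - \hat{X}(t - 1)| \le c$. Thus, we can apply the Hoeffding-Azuma Inequality (see Theorem
4.16 of 1201 ). In particular, we can conclude that

$$\mathrm{Prob}[\hat{X}(t) \le -\varepsilon t] \le 2 \exp\left(-\frac{\varepsilon^2 t}{2c^2}\right).$$

Our choice of $\alpha$ above ensured that $E[Y'(t) \mid \mathcal{F}_{t-1}] \ge 0$. Hence $Y'(t) \ge \hat{Y}(t)$
and $X'(t) \ge \hat{X}(t)$. We observed earlier that $X(t) \ge X'(t)$. Thus, $X(t) \ge \hat{X}(t)$,
so we have

$$\mathrm{Prob}[X(t) \le -\varepsilon t] \le 2 \exp\left(-\frac{\varepsilon^2 t}{2c^2}\right).$$

Since $\sum_{t} 2 \exp\left(-\frac{\varepsilon^2 t}{2c^2}\right)$ converges, we deduce that

$\mathrm{Prob}[X(t) \ge -\varepsilon t \text{ for all } t \ge T] \to 1$ as $T \to \infty$.

Since $\Delta\mathrm{Out}(t) = X(t) + \alpha\Delta B(t) \ge X(t)$, for all $t$, we obtain the required
conclusion. □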



Finally, we can prove the main results of this section. 

Theorem 9. Let $P$ be an acknowledgement-based protocol. Let $\lambda = \lambda_1 \approx 0.531$
be the (unique) root of $\lambda = (1 + \lambda)e^{-2\lambda}$. Then

1. $P$ is transient for arrival rates greater than $\lambda_1$;

2. $P$ has capacity no greater than $\lambda_1$.

Proof. Let $\lambda$ be the arrival rate, and suppose $\lambda > \lambda_1$. If $\lambda > \lambda_0 \approx 0.567$ then
the result follows from Lemma 5. Otherwise, we can assume that $\lambda < \lambda_2 \approx 0.659$.
If $E$ is the eager genie derived from $P$, then the corresponding Backlogs
satisfy $\mathrm{Backlog}_P(t) = \mathrm{Backlog}_E(t) + \Delta\mathrm{Out}(t)$. The results of Lemmas 6 and 7
show that, for some $\varepsilon > 0$, both $\mathrm{Prob}[\mathrm{Backlog}_E(t) > 2\varepsilon t \text{ for all } t \ge T]$ and
$\mathrm{Prob}[\Delta\mathrm{Out}(t) \ge -\varepsilon t \text{ for all } t \ge T]$ tend to 1 as $T \to \infty$. The conclusion of the
theorem follows. □
Abstract. We consider broadcasting in radio networks: one node of the
network knows a message that needs to be learned by all the remaining 
nodes. We seek distributed deterministic algorithms to perform this task. 
Radio networks are modeled as directed graphs. They are unknown, in 
the sense that nodes are not assumed to know their neighbors, nor the 
size of the network, they are aware only of their individual identifying 
numbers. If more than one message is delivered to a node in a step then 
the node cannot hear any of them. Nodes cannot distinguish between 
such collisions and the case when no messages have been delivered in a 
step. 

The fastest previously known deterministic algorithm for deterministic 
distributed broadcasting in unknown radio networks was presented in 0 , 
it worked in time We develop three new deterministic dis- 

tributed algorithms. Algorithm A develops further the ideas of 0 and 
operates in time for general networks, and in time 

for sparse networks with in-degrees $O(n^a)$ for $a < 1/2$;
here H is the entropy function. Algorithm B uses a new approach and 
works in time log^^^ n) for general networks or for 

sparse networks. Algorithm C further improves the performance for gen- 
eral networks running in time 
1 Introduction

Wireless communication has become popular due to the recent advent of new 
technologies. It is present not only in the ubiquitous cordless and cellular phones, 
but also in personal communication services, mobile data networks, and local- 
area wireless networks (cf. M)- Protocols used to communicate over wireless 
networks assume as little as possible about the topology of the network, because 
it may change over time, especially in land-mobile radio networks. Communica- 
tion over such networks creates challenging algorithmic problems. 

We consider an abstraction of a wireless network called a radio network and 
modeled as a directed graph. An edge from node $v_1$ to $v_2$ corresponds to the fact
that messages transmitted by $v_1$ can be directly received by $v_2$; in other words, $v_2$
is in the range of the transmitter at $v_1$. The underlying feature of communication
in radio networks is that a node that simultaneously receives messages from at 
least two transmitters cannot hear any of them because of mutual interference. 
If a node cannot hear a message then it hears some noise, distinct from any 
meaningful message. If the noise heard by a node while no messages have been 
sent to it is distinct from the noise heard when many messages have been sent 
to it then the model is said to be with collision detection. 

We consider the problem of dissemination of information in radio networks. 
In its simplest variant, we have one node of the network, called a source, which 
stores a message that needs to be learned by all the remaining nodes. This 
specific communication task is called the problem of broadcasting. We seek dis- 
tributed (decentralized) algorithms to perform broadcasting in radio networks. 
We restrict our attention to deterministic algorithms. 

The nodes of the network do not know the topology of the underlying graph, 
except for their own individual numbers and possibly the total number of nodes. 
In particular, a node is not assumed to know the numbers of the nodes with 
whom it could communicate directly. 

The nodes of the network have access to a global clock and operate in steps. 
The performance of algorithms is measured by their worst-case time behavior 
over all the possible networks of a given size. 

Review of prior work. A lower bound $\Omega(n)$ for deterministic distributed
broadcasting in unknown radio networks was proved in The best currently
known lower bound is $\Omega(n \lg n)$, see |EI. The first distributed deterministic
algorithms for unknown radio networks were presented in jS|; however the networks
considered there were quite restricted, namely, nodes were assumed to be located 
in a line, and each node could reach directly all the nodes within a certain dis- 
tance. A systematic study of deterministic distributed algorithms in unknown 
radio networks modeled as directed graphs was undertaken in p]. This paper 
compared the broadcasting power of radio networks distinguished by the avail- 
ability of collision detection. The problem of broadcasting was considered in ^ 
in two variants, depending on whether the source was required to be informed 
about the task having been completed {radio broadcasting with acknowledge- 
ment) or not {radio broadcasting without acknowledgement) . It was shown that 
the former task could not be performed on a radio network if nodes do not know 
the size of the network in the model without collision detection. This was shown
to hold even if the underlying graph is symmetric, that is, for each edge in the 
graph the edge with the reversed direction is also available. Algorithms were de- 
veloped in 1^ for the problem of broadcasting without acknowledgement in the 
model without collision detection. One of them operated in time $O(n)$, but was
restricted to the case when the underlying graph was symmetric. The algorithm 
developed in for general networks had performance The model with 

collision detection is capable of performing acknowledged radio broadcasting; it
was shown in jO] how to achieve this in time $O(n)$ in symmetric graphs, and in
time $O(n \cdot \mathit{ecc})$ for general strongly connected graphs, where $\mathit{ecc}$ is the largest
distance from the source to any other node.

Summary of contributions. We consider the problem of broadcasting in un- 
known radio networks by distributed algorithms. The model is without collision 
detection. The broadcasting is without acknowledgement. We develop three de- 
terministic algorithms. 

Algorithm A operates in time for any e > 0, for general networks. 

The constant $\lambda$ is the root of the equation $\lambda + H(\lambda) = 1$, where $H$ is the (binary)
entropy function; numerically, the bound is Algorithm A 

operates in time $O(n^{1+a+H(a)+o(1)})$ for sparse networks, in which in-degrees of
nodes are $O(n^a)$ for $a < 1/2$. Algorithm A is developed by employing many
selective families of subsets of [l..n] simultaneously. The notion of a selective 
family was introduced in ^ to develop an algorithm working in time 
which used just two special selective families. 
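The constant defined by $\lambda + H(\lambda) = 1$ has no closed form, but it is easy to approximate. The sketch below is a minimal illustration, assuming that $H$ is the binary entropy function in bits and that the intended root is the one in $(0, 1/2)$, where $\lambda + H(\lambda)$ is increasing (the value $\lambda = 1$ also satisfies the equation trivially). The function names are hypothetical.

```python
import math


def binary_entropy(x):
    """H(x) = -x lg x - (1 - x) lg(1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def entropy_constant(tol=1e-12):
    """Root of x + H(x) = 1 in (0, 1/2), found by bisection."""
    lo, hi = 0.0, 0.5   # the left side is 0 at x = 0 and 1.5 at x = 1/2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid + binary_entropy(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions the root is close to 0.227.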

Algorithm B uses transmissions based on the processor ID modulo p for many 
primes p. It runs in time log^^^ n) or for sparse networks. 

Algorithm C works in time The underlying paradigm is to take a 

prime number $p$ such that $p^2 \ge n$ and to use for simultaneous transmission sets
of points in $[0..p-1] \times [0..p-1]$ that are lines described by equations over the
field $F_p$. Algorithm C is the fastest that we know of in the general case.
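The geometric idea behind Algorithm C can be illustrated by enumerating the lines of the plane over $F_p$ as sets of node IDs. This is only a sketch of the underlying paradigm, not of the algorithm itself: the mapping of an ID to a point via `divmod`, the choice of the smallest suitable prime, and all function names are assumptions, and the order in which lines would be used as transmissions is not modelled.

```python
import math


def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m >= 2 and all(m % d for d in range(2, math.isqrt(m) + 1))


def prime_for(n):
    """Smallest prime p with p * p >= n."""
    p = 2
    while p * p < n or not is_prime(p):
        p += 1
    return p


def line_transmissions(n):
    """All lines over F_p, each given as the set of IDs in [0..n-1] lying on it."""
    p = prime_for(n)
    # Hypothetical embedding: ID v is placed at the point (v // p, v % p).
    points = {v: divmod(v, p) for v in range(n)}
    lines = []
    for a in range(p):          # lines y = a*x + b over F_p
        for b in range(p):
            lines.append({v for v, (x, y) in points.items() if (a * x + b - y) % p == 0})
    for c in range(p):          # vertical lines x = c
        lines.append({v for v, (x, y) in points.items() if x == c})
    return p, lines
```

The property that makes lines attractive as transmission sets is that any two distinct points lie on exactly one common line, so any two distinct IDs transmit together in exactly one of these $p^2 + p$ sets.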

Other related work. Early work on radio communication dealt with the single- 
hop radio network model, which is a channel accessible by all the processors 
(see Oj ). for recent results see m This is a special case of our model when 
the graph is complete and symmetric. The general case has been called the 
multi-hop radio network. 

Much of the previous work on distributed broadcasting in radio networks has 
concentrated on randomized algorithms (j3EI3). In a randomized protocol 
was developed that works in time $O((D + \lg n/\epsilon) \cdot \lg n)$ with probability $1 - \epsilon$,
where $D$ is the diameter of the network. This is close to optimal as follows from
the known lower bounds. One of them, proved in P|, is $\Omega(\lg^2 n)$, and holds even
for graphs of a constant depth; the other, $\Omega(D \lg(n/D))$, was shown in [II

Another area of research on radio networks was how to compute an optimal 
schedule of broadcasting, by a centralized algorithm, given a description of the 
network. It was shown in |2j that this problem is NP-hard; this holds true even 
when restricted to graphs induced by nodes located in the plane and edges 
determined by ranges of nodes, see m- On the other hand, it was shown in
that broadcasting can be performed in time 0{D + Ig® n). 

Simulations between the synchronous point-to-point message-passing model 
and the radio model were presented in |5|. Fault-tolerance issues of distributed 
radio broadcasting were studied in irm . 

2 Model of Computation 

Radio network. The network is modeled as a directed graph. Node $v_1$ is a
neighbor or successor of node $v_2$ if there is a link from $v_2$ to $v_1$. It is assumed
that for any node v of the network there is a directed path from the source to v. 

In each step a node may choose to be in one of two modes: either the receive 
mode or the broadcast mode. A node in the broadcast mode transmits a message 
along all its out-going links. A message is delivered to all the recipients in the 
step in which it was sent. A node in the receive mode attempts to hear the 
messages delivered along all its in-coming links. If a message has been delivered 
to a node it does not necessarily mean that the node is able to hear it. The basic 
feature of the radio network is that if more than one link brings in messages to a 
node V during a step then v does not hear any of the messages, and all of them 
are lost as far as v is concerned. Node v can hear a message delivered along a 
link if this link is the only one bringing in a message. If more than one message 
arrives at a node at one step then a collision at the node is said to occur. The 
model of radio communication is in two variants, depending on the ability of 
nodes to detect collisions. If a node cannot hear a message during a step then it 
may be assumed to hear some default signal called noise. Noise is distinct from 
any meaningful source messages. If there is only one noise signal then it is called 
the background noise and the model is said to be without collision detection. 
The model with collision detection allows two different noise signals: no messages 
delivered produces the background noise, but more than one produce interference 
noise. In this paper we work with the model without collision detection. 
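The reception rule of the model without collision detection can be captured in a few lines. The sketch below is an illustration under assumed conventions: the graph is a dictionary mapping each node to its successors, `transmitters` is the set of nodes in broadcast mode at the current step, and every other node is in receive mode. The function name is hypothetical.

```python
def heard_messages(out_edges, transmitters):
    """Map each receiving node that hears a message to the node it heard.

    A node hears a message only if exactly one of its in-coming links carries
    one; silence and collisions are indistinguishable, so neither is reported.
    """
    incoming = {}
    for sender in transmitters:
        for receiver in out_edges.get(sender, ()):
            incoming.setdefault(receiver, []).append(sender)
    return {
        receiver: senders[0]
        for receiver, senders in incoming.items()
        if len(senders) == 1 and receiver not in transmitters
    }
```

For example, with links 0→2, 1→2 and 0→3, nodes 0 and 1 transmitting together cause a collision at node 2, while node 3 still hears node 0.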

Local knowledge. If the network consists of n nodes then each of them is 
assigned a unique integer in the range 0 through n — 1 which is the identification 
number (ID) of the node. Nodes have a restricted knowledge of the underlying 
graph. Each node knows only its ID, and if it is the source node. Nodes are not 
assumed to know the IDs of nodes to which they are connected. Nodes are also 
not assumed to know the size of the network; however, to simplify the exposition 
of algorithms we refer explicitly to the size n; this can be avoided, see Section E3 

A broadcasting algorithm is said to have completed broadcasting when all the 
nodes have learned the source message. When this happens the nodes need not be 
aware of it. Nodes do not send any feedback or acknowledgement information to 
other nodes. The algorithm we present may terminate after a prescribed number 
of steps, which is yielded by the performance bounds that we prove. 

Distributed setting. The communication protocol operates in steps synchronized
by a global clock. During a step each node may make an attempt to send a
message or to receive a message. The nodes of the network are processing units,
each able to perform local sequential computations. Whenever a node needs to
perform local computation to decide the next send/receive operation, it is as- 
sumed that this local computation can always be performed during the current 
step. Similarly we do not assume any restrictions on the size of local memories. 

At each step a node decides whether to broadcast, to remain quiet or to 
terminate. The broadcast mode is entered by a node only if the source message 
has been already received by that node. Each decision of node v concerning 
broadcast/receive mode at a current step depends on the following information: 

— the status of v: is v the source node or not; 

— the number of the current step; 

— the size n of the network; 

— the ID of node v; 

— whether $v$ has already heard the source message.

Algorithms. An algorithm is represented by a sequence of sets $T_0, T_1, T_2, \ldots$
called transmissions. Each $T_i \subseteq [0..n-1]$ is a set of IDs of nodes. Set $T_0$ consists
of only the source node. A node $v$ enters the broadcast mode at step $i$, and hence
sends the source message to its neighbors, if and only if the following conditions
are simultaneously satisfied:

— the ID of $v$ is in $T_i$,

— node $v$ has already heard the source message.

A simple algorithm was defined in 0 by taking $T_i = \{i \bmod n\}$, for $i > 0$.
We call it the round-robin algorithm. It completes broadcasting in time $O(n^2)$.
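The round-robin algorithm is simple enough to simulate directly. The following sketch assumes the network is given as a dictionary from each node to its successors and that every node is reachable from the source; the function name is hypothetical. Since only one node is scheduled per step, no collisions can occur.

```python
def round_robin_broadcast(out_edges, n, source=0):
    """Run round-robin until all n nodes are informed; return the number of steps used."""
    informed = {source}
    step = 0
    while len(informed) < n:
        if step > n * n:
            raise ValueError("some node is not reachable from the source")
        # T_0 holds only the source; afterwards T_i = {i mod n}.
        transmission = {source} if step == 0 else {step % n}
        # A scheduled node transmits only if it already knows the message.
        for sender in transmission & informed:
            informed.update(out_edges.get(sender, ()))
        step += 1
    return step
```

Every block of $n$ consecutive steps lets each informed node transmit alone once, so at least one new node is informed per block. This is the reason for the quadratic bound.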

(Notation lg denotes the logarithm to the base 2. We assume that each number
in $[0..n-1]$ has exactly $\lfloor \lg n \rfloor + 1$ digits in its binary representation, obtained
by padding the most significant positions with zeros if necessary. The least significant
position has number 1 and the most significant has number $\lfloor \lg n \rfloor + 1$.)

Broadcasting paradigm. A node evolves through three conceptual states in
the course of a broadcasting algorithm. If it does not know the source message 
yet, it is uninformed. A node becomes active in the step when it learns the source 
message for the first time. In the beginning the source node is the only active 
node and the remaining nodes are all uninformed. An active node changes its 
status to passive when the last of its neighbors learns the source message. 

Lemma 1. If a node broadcasts as the only active node it becomes passive. 

Proof. Since uninformed nodes do not broadcast, the active node i and possibly
certain passive nodes are the only ones which broadcast. Passive nodes cannot 
interfere with the receipt of the message by uninformed neighbours of i since 
passive nodes have no uninformed neighbours. 

Each time a node changes its status we say that progress has been made. Each 
change from uninformed to active or from active to passive contributes a unit 
to the measure of progress, and change from uninformed to passive contributes 
two units. After progress $2n - 1$ has been accrued we know that all the nodes are
passive, so all the nodes know the source message by the definition of passive.

Selective families. Following 0, we say that a family $\mathcal{R}$ of subsets of $[0..n-1]$
is y-selective, for a positive integer y, if for any subset $Z \subseteq [0..n-1]$ such that
$|Z| \le y$ there is a set $S \in \mathcal{R}$ such that $|S \cap Z| = 1$. We also say that a family
$\mathcal{R}$ of subsets of $[0..n-1]$ is strongly y-selective, for a positive integer y, if for
any subset $Z \subseteq [0..n-1]$ such that $|Z| \le y$, and any $z \in Z$, there is a set $S \in \mathcal{R}$
such that $S \cap Z = \{z\}$.
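For small n both notions can be checked by exhaustive search. The sketch below is illustrative only; it assumes the size bound reads "|Z| at most y" with Z nonempty, and the function names are hypothetical.

```python
from itertools import combinations


def is_selective(family, n, y):
    """True if every nonempty Z with |Z| <= y meets some S in exactly one element."""
    for size in range(1, y + 1):
        for Z in combinations(range(n), size):
            Zs = set(Z)
            if not any(len(S & Zs) == 1 for S in family):
                return False
    return True


def is_strongly_selective(family, n, y):
    """True if every element z of every such Z can be singled out by some S."""
    for size in range(1, y + 1):
        for Z in combinations(range(n), size):
            Zs = set(Z)
            for z in Zs:
                if not any(S & Zs == {z} for S in family):
                    return False
    return True
```

The singletons of [0..n-1] are strongly n-selective, which is the property the round-robin schedule relies on; the exhaustive search is exponential in y and only meant for toy sizes.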

3 Selective Families Streamlined 

In P specific selective families were considered, defined as follows. Let
$W = (w_1, \ldots, w_k)$ be a sequence of integers such that
$1 \le w_1 < w_2 < \cdots < w_k \le \lfloor \lg n \rfloor + 1$.
Let $B = (b_1, \ldots, b_k)$ be a sequence of bits, each equal to either 0 or 1.
Set $S(W,B)$ is defined to consist exactly of those integers in $[0..n-1]$ which
have the $w_i$-th digit in their binary expansions equal to $b_i$ ($1 \le i \le k$).

The number k used above is said to be the width of W, B and $S(W,B)$.
The family $\mathcal{R}(k)$ is defined to consist of all the sets of width k of the form
$S(W,B)$. Hence a set $S = S(W,B) \in \mathcal{R}(k)$ is determined by selecting both
certain k positions in the binary expansions of numbers in $[0..n-1]$ and the
digits on those positions.

The key property of $\mathcal{R}(k)$ is that it is $2^k$-selective (IHI). To see this, take
$Z \subseteq [0..n-1]$ where $|Z| \le 2^k$. Let $w_1$ be the smallest integer such that there
are two elements in Z whose binary expansions differ on position $w_1$.
This means that the sets $Z_i$ consisting of those $x \in Z$ for which the $w_1$-th digit in the
binary expansion of x is i, for $i \in \{0, 1\}$, are both nonempty and disjoint. Let $b_1$
be the number with the property that $|Z_{b_1}| \le |Z_{1-b_1}|$. Hence $0 < |Z_{b_1}| \le \frac{1}{2}|Z|$.
Replacing Z by $Z_{b_1}$ and continuing in a similar fashion we obtain sequences
$W = (w_1, \ldots, w_i)$ and $B = (b_1, \ldots, b_i)$ such that $|S(W,B) \cap Z| = 1$ and $i \le k$.
They can be padded to sequences of length exactly k if $i < k$.
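The halving argument is constructive and can be written out directly. This is a sketch under the conventions stated earlier (position 1 is the least significant digit); `digit`, `S` and `isolate` are hypothetical names, and ties between the two halves are broken in favour of digit 0.

```python
def digit(m, w):
    """Binary digit of m at position w (position 1 = least significant)."""
    return (m >> (w - 1)) & 1


def S(W, B, n):
    """Integers in [0..n-1] whose digit at position W[i] equals B[i] for all i."""
    return {m for m in range(n)
            if all(digit(m, w) == b for w, b in zip(W, B))}


def isolate(Z):
    """Return (W, B) such that S(W, B) contains exactly one element of Z.

    Z must be a nonempty set of distinct non-negative integers.
    """
    Z = set(Z)
    W, B = [], []
    while len(Z) > 1:
        # Smallest position on which two remaining elements differ.
        w = 1
        while len({digit(z, w) for z in Z}) == 1:
            w += 1
        zeros = {z for z in Z if digit(z, w) == 0}
        ones = Z - zeros
        # Keep the smaller (nonempty) half, so the size at least halves.
        b = 0 if len(zeros) <= len(ones) else 1
        Z = zeros if b == 0 else ones
        W.append(w)
        B.append(b)
    return W, B
```

For a set of at most $2^k$ elements the loop runs at most k times, which is why width k suffices.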

Sets from families $\mathcal{R}(k)$ can be used as transmissions. The round-robin
algorithm uses just singleton sets, which is the family $\mathcal{R}(\lfloor \lg n \rfloor + 1)$. The algorithm
of 0 uses these singletons interleaved with some other specific family $\mathcal{R}(k)$.

We use $O(\lg n)$ families $\mathcal{R}(k)$ simultaneously. Algorithm A is organized as a
loop which iterates phases. A phase consists of $O(\lg n)$ transmissions, each from
a different family $\mathcal{R}(k)$. The algorithm cycles through all the elements in $\mathcal{R}(k)$.
To be specific, let us fix an ordering $(S_1, S_2, \ldots)$ of all the sets in $\mathcal{R}(k)$. Then in
phase i set $S_j$ is the transmission from $\mathcal{R}(k)$, where $j = i \bmod |\mathcal{R}(k)|$.

Family $\mathcal{R}(\lfloor \lg n \rfloor + 1)$ has exactly n singleton sets. So during n consecutive
phases each node with the source message will have an opportunity to be the
only node performing broadcasting in the whole network. Hence its neighbors
will all hear the message. We do not expect to gain much using sets from $\mathcal{R}(k)$
as transmissions if $|\mathcal{R}(k)| > n$, and they do not matter as far as the analysis of
performance of algorithm A is concerned.

Let $k = x \cdot \lg n$, for $0 < x < 1/2$. Then

$$|\mathcal{R}(k)| = \binom{\lfloor \lg n \rfloor + 1}{k} \cdot 2^k = O(n^{x + H(x)}),$$

where $H(x) = -[x \lg x + (1-x) \lg(1-x)]$ is the (binary) entropy function.

Let $0 < \lambda < 1/2$ be the solution of the equation $\lambda + H(\lambda) = 1$, $\lambda = 0.22709\ldots$.
The families $\mathcal{R}(k)$ for $k \le \lambda \lg n$ have $O(n)$ subsets of $[0..n-1]$ each.
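The constant can be reproduced numerically. The sketch below is an illustration, not part of the paper: it solves x + H(x) = 1 by bisection on (0, 1/2), where the left-hand side is increasing, and counts the (W, B) pairs of a given width, assuming every number is written with $\lfloor \lg n \rfloor + 1$ binary digits. The names are hypothetical, and distinct pairs may describe the same (possibly empty) set.

```python
import math


def H(x):
    """Binary entropy, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -(x * math.log2(x) + (1 - x) * math.log2(1 - x))


def solve_lambda(tol=1e-12):
    """Root of x + H(x) = 1 in (0, 1/2), found by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid + H(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def pair_count(n, k):
    """Number of (W, B) pairs of width k for IDs in [0..n-1]."""
    digits = n.bit_length()  # equals floor(lg n) + 1 for n >= 1
    return math.comb(digits, k) * 2 ** k
```

For n = 1024 the bound $k \le \lambda \lg n$ allows widths 1 and 2, and `pair_count(1024, 2)` is well below n.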

Algorithm A has phases that use transmissions from each $\mathcal{R}(k)$ such that
$k \le \lambda \lg n$, and additionally from $\mathcal{R}(\lfloor \lg n \rfloor + 1)$, which are singletons.

Let us consider the contribution to progress from any s + 1 families
$\mathcal{R}(k_1), \mathcal{R}(k_2), \ldots, \mathcal{R}(k_s)$ and $\mathcal{R}(\lfloor \lg n \rfloor + 1)$, where
$k_1 < k_2 < \cdots < k_s < \lfloor \lg n \rfloor + 1 = k_{s+1}$.
We assume $|\mathcal{R}(k_i)| = O(n)$. Transmissions from these families are interleaved:
during step t we use a transmission from $\mathcal{R}(k_x)$ where $x = (t \bmod (s+1)) + 1$.
Transmissions from $\mathcal{R}(k_i)$ are used in some cyclical order.

Let F be the set of active nodes at the start of a specific phase t. We consider
a number of cases.

Case 1: Progress is made during the phases t through $t + |\mathcal{R}(k_1)|$.

If Case 1 does not hold, consider the process described above demonstrating
that $\mathcal{R}(k_1)$ is $2^{k_1}$-selective but now take both $b_j = 0$ and $b_j = 1$ at step j; this
will give $2^{k_1}$ transmissions $C_1 \subseteq \mathcal{R}(k_1)$ such that all the sets of the form $X \cap F$,
for $X \in C_1$, are disjoint, and each includes at least two elements.

If $T_1 \in \mathcal{R}(x)$, $T_2 \in \mathcal{R}(y)$, $x < y$ and $T_2 \subseteq T_1$ then $T_2$ is said to refine $T_1$.

Case 2: Case 1 does not hold and during the phases t through $t + |\mathcal{R}(k_2)|$
at least half of the transmissions in $C_1$ are refined by transmissions in $\mathcal{R}(k_2)$,
each of them including exactly one element from F.

If neither of Cases 1 and 2 holds then there is a set $C_1' \subseteq C_1$ of transmissions
such that $|C_1'| = \frac{1}{2}|C_1|$, and each transmission $T \in C_1'$ is refined by
transmissions in some $A(T) \subseteq \mathcal{R}(k_2)$ such that if $S \in A(T)$ then $|S \cap F| > 1$.
Let $C_2 = \bigcup_{T \in C_1'} A(T)$. Then $|C_2| = \frac{1}{2} \cdot 2^{k_2}$.

Case 3: Neither Case 1 nor Case 2 holds and during the phases t through
$t + |\mathcal{R}(k_3)|$ at least half of the transmissions from $C_2$ are refined by
transmissions from $\mathcal{R}(k_3)$, each including exactly one element from F.

Cases up to s are defined similarly by properties of sets $C_i$, for $1 \le i \le s$.
If Case s holds then during the phases t through $t + |\mathcal{R}(k_s)|$ at least half of the
transmissions from $C_{s-1}$ are refined by transmissions from $\mathcal{R}(k_s)$ such that each
of them includes exactly one element from F.

Case s + 1: None of the Cases 1 through s holds.

If Case s + 1 holds then $|F| \ge 2^{-s} \cdot 2^{k_s}$.

Let us estimate the average progress per phase over a period starting at
phase t when Case i holds; it is denoted by $\alpha_i$. If Case 1 holds then

$$\alpha_1 \ge |\mathcal{R}(k_1)|^{-1}$$

during $|\mathcal{R}(k_1)|$ phases. If Case i holds, for $2 \le i \le s$, then

$$\alpha_i \ge 2^{k_{i-1}} \cdot 2^{-i+1} \cdot |\mathcal{R}(k_i)|^{-1}$$

during $|\mathcal{R}(k_i)|$ phases. If Case s + 1 holds then during n phases the round robin
contributes average progress per phase of

$$\alpha_{s+1} \ge |F| \cdot n^{-1} \ge 2^{-s} \cdot 2^{k_s} \cdot n^{-1}.$$

We are guaranteed an average progress of at least $\min(\alpha_1, \ldots, \alpha_{s+1})$ because one of
the cases 1 through s + 1 always holds. To make this guaranteed minimum as
large as possible we seek to make our estimates of $\alpha_1, \ldots, \alpha_{s+1}$ asymptotically equal:

$$|\mathcal{R}(k_1)|^{-1} = \Theta(2^{k_1} \cdot |\mathcal{R}(k_2)|^{-1}) = \cdots = \Theta(2^{k_{s-1}} |\mathcal{R}(k_s)|^{-1}) = \Theta(2^{k_s} \cdot n^{-1}). \quad (1)$$

Let $k_i = a_i \lg n$. Equations (1) translate into

$$-a_1 - H(a_1) = a_1 - a_2 - H(a_2) = \cdots = a_{s-1} - a_s - H(a_s) = a_s - 1. \quad (2)$$

Equations (2) are equivalent to the following system of equations:

$$\begin{aligned}
2a_s + H(a_s) &= a_{s-1} + 1 \\
2a_{s-1} + H(a_{s-1}) &= a_{s-2} + a_s + H(a_s) \\
&\;\;\vdots \\
2a_2 + H(a_2) &= a_1 + a_3 + H(a_3) \\
2a_1 + H(a_1) &= a_2 + H(a_2)
\end{aligned}$$

Let function f be defined as $f(x) = x + H(x)$, for $0 \le x \le 1/2$. Our system of
equations can be rewritten as follows using notation with f:

$$\begin{aligned}
1 - f(a_s) &= a_s - a_{s-1} \\
a_{s-1} - a_{s-2} &= f(a_s) - f(a_{s-1}) \\
a_{s-2} - a_{s-3} &= f(a_{s-1}) - f(a_{s-2}) \\
&\;\;\vdots \\
a_2 - a_1 &= f(a_3) - f(a_2) \\
a_1 &= f(a_2) - f(a_1)
\end{aligned}$$

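We can equate $f(y) - f(x) = f'(z)(y - x)$ for certain z satisfying $x \le z \le y$.
Numbers $a_1, \ldots, a_s$ are in $[0..\lambda]$, where $f(\lambda) = 1$. If $0 < z < \lambda$ then
$f'(z) > f'(\lambda) > f'(1/2) = 1$. We obtain the following estimation:

$$\begin{aligned}
1 - f(a_s) = a_s - a_{s-1} &< \frac{1}{f'(\lambda)} \bigl(f(a_s) - f(a_{s-1})\bigr) = \frac{1}{f'(\lambda)} (a_{s-1} - a_{s-2}) \\
&< \frac{1}{[f'(\lambda)]^2} \bigl(f(a_{s-1}) - f(a_{s-2})\bigr) < \cdots \\
&< \frac{1}{[f'(\lambda)]^{s-1}} \bigl(f(a_2) - f(a_1)\bigr) = a_1 \cdot [f'(\lambda)]^{1-s} < [f'(\lambda)]^{1-s}.
\end{aligned}$$
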
This shows that $1 - f(a_s)$ converges to 0 as $s \to \infty$. It follows that $\lim_{s \to \infty} a_s = \lambda$
because f is a continuous function.

Theorem 1. For any $\varepsilon > 0$, algorithm A completes broadcasting in time
$O(n^{2 - \lambda + \varepsilon})$, for $\lambda = 0.22709\ldots$ defined by $\lambda + H(\lambda) = 1$.

Proof. The average progress after phase t is $\Omega(n^{a_s - 1})$
per phase. To achieve progress $2n - 1$ we need $O(n / n^{a_s - 1}) = O(n^{2 - a_s})$ phases.
A phase takes $O(\lg n) = O(n^{\varepsilon})$ steps, and $\lim_{s \to \infty} a_s = \lambda$.
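The convergence of $a_s$ to $\lambda$ can be observed numerically. The sketch below rests on a reformulation that is an assumption of this illustration rather than a statement from the paper: writing c = f(a_1) for the common value in the chain of equalities, the system reads f(a_{i+1}) = c + a_i for i < s together with c + a_s = 1, and the left-hand side of the last condition is nondecreasing in a_1, so a_1 can be found by bisection. All function names are hypothetical.

```python
import math


def H(x):
    """Binary entropy, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -(x * math.log2(x) + (1 - x) * math.log2(1 - x))


def f(x):
    return x + H(x)


def f_inv(y, iters=60):
    """Inverse of f on [0, 1/2] by bisection (f is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def solve_exponents(s, iters=60):
    """Return [a_1, ..., a_s] solving the system for a given s >= 1."""
    def chain(a1):
        c = f(a1)
        seq = [a1]
        for _ in range(s - 1):
            seq.append(f_inv(c + seq[-1]))  # f(a_{i+1}) = c + a_i
        return c, seq

    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = (lo + hi) / 2
        c, seq = chain(mid)
        if c + seq[-1] < 1.0:  # closing condition c + a_s = 1
            lo = mid
        else:
            hi = mid
    return chain((lo + hi) / 2)[1]
```

With s = 1 the single exponent solves 2a + H(a) = 1; increasing s moves the last exponent towards $\lambda$ from below, which is what drives the bound $O(n^{2-\lambda+\varepsilon})$.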

Corollary 1. Algorithm A operates in time $O(n^{1.77291})$.

Next we consider graphs with bound d = d(n) on in-degrees of nodes.

Theorem 2. Algorithm A completes broadcasting in time $O(n \cdot \binom{\lg n}{\lg d} \cdot d \lg n)$
if in-degrees are bounded by d.

Proof. Consider the contribution to progress made by $\mathcal{R}(k)$ for $k = \lceil \lg d \rceil$.
Let v be an uninformed node that is a neighbor of an active node. There
are at most d such active nodes for v. Family $\mathcal{R}(k)$ is d-selective. Hence during
$O(|\mathcal{R}(k)|)$ phases there will be a step where just one of these active nodes
broadcasts and informs node v. The total time of algorithm A is thus bounded
by $O(n |\mathcal{R}(k)| \lg n) = O(n \cdot \binom{\lg n}{\lg d} \cdot d \lg n)$.

Corollary 2. If $d = O(n^a)$, for $a < 1/2$, then algorithm A completes broadcasting
in time $O(n^{1 + a + H(a) + \varepsilon})$, for any constant $\varepsilon > 0$.

4 Multi-prime Algorithms 

Let $p_i$ denote the i-th prime number and $k(x)$ denote the smallest integer such
that $\prod_{i=1}^{k(x)} p_i > n^{x-1}$. Every transmission will be of the form
$S_{i,j} = \{m \mid 0 \le m \le n-1 \text{ and } m \equiv j \pmod{p_i}\}$
for some prime $p_i$ and some $j < p_i$. The set of all
transmissions $S_{i,j}$ for a prime $p_i$ is called a $p_i$ stage and two stages will be called
different if they use different primes $p_i$ and $p_j$. The union of all $p_i$ stages with
$i \le k(2^s)$ will be called the s-family. We note that it is a strongly $2^s$-selective
family since every node transmits once for each prime $p_i$ and cannot be blocked
by a given other active node for a set of primes with product greater than n.
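The strong selectivity of an s-family can be checked by brute force for small n. The sketch below is illustrative and relies on a reading that is an assumption here, since the printed definition is damaged in this copy: k(x) is taken to be the least k for which the product of the first k primes exceeds n^(x-1), and the s-family is the union of the residue-class stages of the first k(2^s) primes. All names are hypothetical.

```python
from itertools import combinations


def primes_for(x, n):
    """Shortest prefix of the primes whose product exceeds n**(x - 1)."""
    primes, product, candidate = [], 1, 2
    while product <= n ** (x - 1):
        if all(candidate % p for p in primes):
            primes.append(candidate)
            product *= candidate
        candidate += 1
    return primes


def s_family(n, s):
    """All residue classes {m < n : m = j mod p} for the selected primes p."""
    return [{m for m in range(n) if m % p == j}
            for p in primes_for(2 ** s, n) for j in range(p)]


def is_strongly_selective(family, n, y):
    """Every z in every nonempty Z with |Z| <= y is singled out by some set."""
    for size in range(1, y + 1):
        for Z in combinations(range(n), size):
            Zs = set(Z)
            for z in Zs:
                if not any(S & Zs == {z} for S in family):
                    return False
    return True
```

The check succeeds because two distinct IDs below n can share a residue only for primes whose product divides their difference, which is smaller than n.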

Lemma 2. If at time $t_0$ some uninformed node m has l informed predecessors
and no uninformed node has more than l informed predecessors and the transmissions
between $t_0$ and $t_1$ include different $p_{i_j}$ stages ($1 \le j \le m$) such that
$\prod_{j=1}^{m} p_{i_j} \ge n^{2l-1}$, then there is progress at least l between $t_0$ and $t_1$.

Proof. Let $m_1', \ldots, m_l'$ be the informed predecessors of m. If they all become
passive or l new nodes become active between $t_0$ and $t_1$, then there has been
progress at least l. Otherwise some $m_i'$ remains active so it has a successor $m^*$
which remains uninformed; $m^*$ had at most l informed predecessors at $t_0$ and
acquired less than l more between $t_0$ and $t_1$. The union of the $p_{i_j}$ stages is a
strongly (2l)-selective family so there is a transmission between $t_0$ and $t_1$ in
which $m_i'$ and none of these other predecessors of $m^*$ transmits. So $m^*$ would
become active at the latest at that transmission so that this case cannot arise.

We can build different versions of our algorithm using s+ 1-families depending 
on whether a bound D on the in-degree is known and whether we want an 
algorithm which takes advantage of the existence of low in-degree. 

Algorithm B1(D) repeats the s-family (ordered as a succession of its $k(2^s)$ stages)
for $s = \lceil \lg 2D \rceil$.

Algorithm B2(D) interleaves B1($2^s$) for all s up to $\lceil \lg 2D \rceil$.

Algorithm B3 interleaves round robin with B1($\sqrt{n / \lg n}$).

Algorithm B4 interleaves B3 with B2($\sqrt{n} / \lg^{3/2} n$).

Theorem 3. Algorithms B run in time:

B1: $O(nD \lg^2 n / \lg(D \lg n))$ if all in-degrees are at most D;

B2: $O(nd \lg D \lg^2 n / \lg(d \lg n))$ if all in-degrees are at most $d \le D$;

B3: $O(n^{3/2} \lg^{1/2} n)$;

B4: $O(\min(n^{3/2} \lg^{1/2} n,\ nd \lg^3 n / \lg(d \lg n)))$ if all in-degrees are at most d.
Proof.

B1: Since no uninformed node can have more than D predecessors, using Lemma 2
we can divide the time into intervals where each interval achieves progress at
least l in a sequence of stages using primes with product less than $n^{2l}$. Since these
primes are $O(D \lg n)$, their sum is
$O\bigl(\frac{l \lg n}{\lg(D \lg n)} \cdot D \lg n\bigr) = O\bigl(\frac{l D \lg^2 n}{\lg(D \lg n)}\bigr)$.
This sum is the number of steps in the interval, giving time per unit progress of
$O\bigl(\frac{D \lg^2 n}{\lg(D \lg n)}\bigr)$.

B2: B1(d) would terminate in time $O(nd \lg^2 n / \lg(d \lg n))$ on its own and so
will terminate in time $O(nd \lg D \lg^2 n / \lg(d \lg n))$ when running interleaved with
$O(\lg D)$ other algorithms.

B3: Starting at any time when the number of active processors is less than
$\sqrt{n / \lg n}$, Lemma 2 gives us an interval with time per unit progress of $O(\sqrt{n \lg n})$
by plugging in $D = \sqrt{n / \lg n}$ in B1; if the number of active processors is at least
$\sqrt{n / \lg n}$, the round robin gives progress at least $\sqrt{n / \lg n}$ in the succeeding
interval of length 2n. In both cases we obtain an interval with time per unit
progress of $O(\sqrt{n \lg n})$.

B4: If $d \le \sqrt{n} / \lg^{3/2} n$, the interleaved B2 guarantees termination within
$O(nd \lg^3 n / \lg(d \lg n))$ steps; otherwise B3 guarantees termination within
$O(n^{3/2} \lg^{1/2} n)$ steps. In each case this gives the time bound stated.

5 Single-Prime Algorithm 

Let p be the smallest prime greater than or equal to $\lceil \sqrt{n} \rceil$. Then $p^2 = O(n)$.
Each of the numbers $i \in [0..p^2 - 1]$ can be represented uniquely by a pair of
integers (x, y) where $i = x \cdot p + y$ and $0 \le x, y \le p - 1$. Such a pair (x, y) are
the coordinates of i. We treat all the numbers in $[0..p^2 - 1]$ as IDs of nodes. We
refer to them by their coordinates rather than IDs.

The integers in $[0..p-1]$ form a field $F_p$ if the arithmetic is modulo prime p. For
$a, b, c \in F_p$, such that at least one among the numbers a and b is distinct from 0,
let $L(a, b, c)$ be the set of nodes whose coordinates (x, y) satisfy the equation

$$a \cdot x + b \cdot y = c$$

in $F_p$. Each set of nodes of the form $L(a, b, c)$ is called a line. Each line has size p.
There are exactly p − 1 lines disjoint with a given one. Two disjoint lines are
said to be of the same direction. The total number of lines is $p(p+1) = O(n)$.
Each node belongs to p + 1 lines. For any two different nodes $v_1$ and $v_2$ there is
exactly one line containing $v_1$ and $v_2$. For any two lines $L_1$ and $L_2$ of different
directions there is exactly one node v such that v belongs to both $L_1$ and $L_2$.

Algorithm C uses singleton sets and lines, singletons in even-numbered steps
and lines in odd-numbered ones. During each consecutive $p^2$ even-numbered steps
each singleton is used exactly once. Similarly, during each consecutive $p(p+1)$
odd-numbered steps each line is used exactly once. Moreover, lines of the same
direction are used in consecutive odd-numbered steps. A stage consists of $2(p+1)$
consecutive steps during which all the lines of the same direction are used.
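The incidence properties of lines that the analysis relies on can be verified for a small prime. This is a sketch only; `lines` is a hypothetical helper, node IDs follow the coordinate convention i = x·p + y, and p is assumed to be prime.

```python
def lines(p):
    """All lines {(x, y) : a*x + b*y = c (mod p)}, as frozensets of IDs x*p + y."""
    found = set()
    for a in range(p):
        for b in range(p):
            if a == 0 and b == 0:
                continue  # at least one coefficient must be non-zero
            for c in range(p):
                found.add(frozenset(x * p + y
                                    for x in range(p) for y in range(p)
                                    if (a * x + b * y) % p == c))
    return found
```

Different triples (a, b, c) that are scalar multiples of one another describe the same line, which is why the result is collected in a set.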

Lemma 3. Let F be a set of nodes such that $|F| \le p/2$. Then, for each node
$v \in F$, during each consecutive 2|F| stages, there are at least |F| steps, each in
a different stage, during which v broadcasts as the only element of F.

Proof. Let k = |F|. Let $T_1, \ldots, T_{2k}$ be the lines including v used during 2k
consecutive stages. Then $T_1 - \{v\}, \ldots, T_{2k} - \{v\}$ are all disjoint. By the pigeonhole
principle, at most k − 1 of them include elements from F.



Lemma 4. Let F be the active nodes at the beginning of stage t. If $|F| \le p/2$
then the average progress per stage during 2|F| stages starting from t is $\Omega(1)$.

Proof. Let k = |F|. Consider the stages $T = \{t, t+1, \ldots, t+2k-1\}$. If each
of the nodes in F broadcasts during T as the only active node then the average
progress per stage during T is at least 1/2. Suppose this is not true and let $v \in F$
be such that whenever it broadcasts in T then some other active node broadcasts
simultaneously. By Lemma 3 there are at least k steps, each in a different stage
in T, such that v broadcasts during these steps as the only element of F. If some
active v' broadcasts simultaneously then v' is not in F. Any two nodes broadcast
together exactly once during all the stages. Hence if v is blocked from being the
only active broadcasting node during k stages by nodes outside F, there need to
be at least k distinct such nodes. None of them was active at the beginning of
stage t, hence all of them acquired the active status during T. This again gives
average progress of at least 1/2 per stage during T.

Theorem 4. Algorithm C completes broadcasting in time $O(n^{3/2})$.

Proof. We estimate the time needed to make progress $2n - 1$. If at any stage
the number a of active nodes is at most p/2 then we have a constant progress
averaged over the next 2a stages by Lemma 4. If the number of active nodes
is greater than p/2 at any stage then the round-robin will contribute at least
$p/2 = \Omega(\sqrt{n})$ during the next $O(n)$ steps. In either case the average progress
per step is $\Omega(n^{-1/2})$. Hence after $O(n / n^{-1/2}) = O(n^{3/2})$ steps all the nodes are
passive, and so have learned the source message.

6 Discussion 

Our algorithms were presented as if the nodes knew the size of the network. This 
assumption can be avoided if the broadcasting is without acknowledgement and 
the protocol works in time p{n), that is polynomial in the size n of the network, 
which is the case in this paper. The modified algorithm works in consecutive 
intervals of length $O(p(2^i))$, with only the first $2^i$ nodes participating in the i-th
interval. The total time in which broadcasting is completed is again $O(p(n))$.
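The reason the doubling scheme costs only a constant factor can be seen from a geometric sum. The following worked bound is an illustration under the added assumption that the running time is a pure power, $p(m) = m^c$ for a constant $c > 0$; the intervals are run for $i = 1, \ldots, \lceil \lg n \rceil$.

```latex
\sum_{i=1}^{\lceil \lg n \rceil} p(2^i)
  = \sum_{i=1}^{\lceil \lg n \rceil} 2^{ci}
  < \frac{2^{c(\lceil \lg n \rceil + 1)}}{2^c - 1}
  < \frac{2^c}{2^c - 1} \, (2n)^c
  = O(n^c) = O(p(n)).
```

The last interval dominates, so the earlier, shorter intervals add only a constant factor to the total time.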
Abstract. We consider the process theory PA that includes an operation
for parallel composition, based on the interleaving paradigm. We
prove that the standard set of axioms of PA is not ω-complete by providing
a set of axioms that are valid in PA, but not derivable from the
standard ones. We prove that extending PA with this set yields an
ω-complete specification, which is finite in a setting with finitely many
actions.



1 Introduction 

The interleaving paradigm consists of the assumption that two atomic actions
cannot happen at the same time, so that concurrency reduces to nondeterminism.
To express the concurrent execution of processes, many process theories have
been accommodated with an operation for parallel composition that behaves
according to the interleaving paradigm. For instance, CCS (see, e.g., Milner (1989))
has a binary operation for parallel composition (we shall denote it by $\_ \parallel \_$)
that satisfies the so-called Expansion Law:

$$\text{if } p = \sum_{i=1}^{m} a_i.p_i \text{ and } q = \sum_{j=1}^{n} b_j.q_j \text{ then } p \parallel q \approx \sum_{i=1}^{m} a_i.(p_i \parallel q) + \sum_{j=1}^{n} b_j.(q_j \parallel p);$$

here the $a_i.\_$ and the $b_j.\_$ are unary operations that prefix a process with an
atomic action, and summation denotes a nondeterministic choice between its
arguments.

The Expansion Law generates an infinite set of equations, one for each pair of
processes p and q. Bergstra and Klop (1984) enhanced the equational characterisation
of interleaving. They replaced action prefixing with a binary operation
$\_ \cdot \_$ for sequential composition and added an auxiliary operation $\lfloor\!\lfloor$ (the left
merge; it is similar to $\parallel$, except that it must start execution with a step from
its left argument). Their axiomatisation is finite for settings with finitely many
atomic actions. Moller (1990) proved that interleaving is not finitely axiomatisable
without an auxiliary operation such as the left merge.

* A full version that contains the omitted proofs is available as CWI Technical Report;
see http://www.cwi.nl/~luttik/

The axioms of Bergstra and Klop (1983) form a ground-complete axiomatisation
of bisimulation equivalence; ground terms p and q are provably equal if,
and only if, they are bisimilar. Thus, it reflects for a large part our intuition
about interleaving. On the other hand, it is not optimal. For instance, it can be
shown by means of structural induction that every ground instance of the axiom
$x \parallel (y \parallel z) \approx (x \parallel y) \parallel z$ is derivable (see Baeten and Weijland (1990)); however,
the axiom itself is not derivable.

If an equational specification E has the property that $E \vdash t^\sigma \approx u^\sigma$ for all
ground substitutions σ implies that $E \vdash t \approx u$, then E is called ω-complete (or:
inductively closed). To derive any equation from such an equational specification
it is never needed to use additional proof techniques such as structural induction.
Therefore, in applications dealing with theorem proving, ω-completeness is a
desirable property to have (see Lazrek et al. (1990)). In Heering (1986) it was
argued that ω-completeness is desirable for the partial evaluation of programs.

Moller (1989) obtained an ω-complete axiomatisation for CCS without communication,
by adding a law for standard concurrency:

$$(x \lfloor\!\lfloor y) \lfloor\!\lfloor z \approx x \lfloor\!\lfloor (y \parallel z).$$

In this paper we shall address the question whether PA, the subtheory of ACP
without communication and encapsulation, is ω-complete. While the algebra
studied by Moller (1989) has sequential composition in the form of prefix
multiplication, PA incorporates the (more general) binary operation · for sequential
composition. Having this operation, it is no longer sufficient to add the law for
standard concurrency to arrive at an ω-complete axiomatisation. However,
surprisingly, it is sufficient to add this law and the set of axioms generated by a
single scheme:

$$x \cdot \alpha \lfloor\!\lfloor \alpha \approx (x \lfloor\!\lfloor \alpha) \cdot \alpha,$$

where α ranges over alternative compositions of distinct atomic actions; if the
set of atomic actions is finite, then this scheme generates finitely many axioms.

An important part of our proof has been inspired by the excellent work
of Hirshfeld and Jerrum (1999) on the decidability of bisimulation equivalence
for normed process algebra. In particular, they distinguish two kinds of mixed
equations, in which a parallel composition is equated to a sequential composition.
The first kind consists of equations

$$(t \cdot \alpha^k) \parallel \alpha^l \approx (t \parallel \alpha^l) \cdot \alpha^k$$

for positive natural numbers k and l, and for sums of atomic actions α. These
equations can be derived using standard concurrency and our new axioms. The
second kind of mixed equations are the so-called pumpable equations, which are
of a more complex nature (see p. 419 of Hirshfeld and Jerrum (1999)). Basically,
we show that there cannot exist pumpable equations that contain variables by
associating with every candidate $t \approx u$ a ground substitution σ such that
$t^\sigma \not\approx u^\sigma$.

The notion of ω-completeness is related to action refinement, where each
atomic action may be refined to an arbitrary process. That is, in a theory with
action refinement, the actions take over the role played by variables in our theory;
the actions, as they occur in our theory, are not present in theories for action
refinement. Aceto and Hennessy (1993) presented a complete axiomatisation for
PA (including a special constant nil, being a hybrid of deadlock and empty
process) with action refinement, modulo timed observational equivalence from
Hennessy (1988). In this setting, laws such as $a \lfloor\!\lfloor x \approx a \cdot x$, which hold in
standard PA, are no longer valid, as the atomic action a can be refined into any
other process.

This paper is set up as follows. In §2 we introduce the standard axioms of
interleaving, and we prove that they do not form an ω-complete specification by
proving that all ground substitution instances of our new axioms are derivable,
while the axioms themselves are not. In §3 we state some basic facts about the
theory of interleaving that we shall need in our proof of ω-completeness. In §4
we collect some results on certain mixed equations, and in §5 we investigate
a particular kind of terms that consist of nestings of parallel and sequential
compositions. In §6 we prove our main theorem, that the standard theory of
interleaving enriched with the law for standard concurrency and our new axioms
is ω-complete.

2 Interleaving 

A process algebra is an algebra that satisfies the axioms A1-A5 of Table 1.
Suppose that A is a set of constant symbols and suppose that $\parallel$ and $\lfloor\!\lfloor$ are
binary operation symbols; a process algebra with interpretations for the constant
symbols in A and the operations $\parallel$ and $\lfloor\!\lfloor$ satisfying M1, M4, M5, M2$_a$ and M3$_a$
for all $a \in A$, and M6$_\alpha$ for all sums α of distinct elements of A, we shall call an
A-merge algebra; the variety of A-merge algebras we denote by $\mathrm{PA}_A$.

Table 1. The axioms of $\mathrm{PA}_A$, with $a \in A$ and α any sum of distinct elements of A.

(A1)  $x + y \approx y + x$                       (M1)   $x \parallel y \approx x \lfloor\!\lfloor y + y \lfloor\!\lfloor x$
(A2)  $x + (y + z) \approx (x + y) + z$           (M2$_a$)  $a \lfloor\!\lfloor x \approx a \cdot x$
(A3)  $x + x \approx x$                           (M3$_a$)  $a \cdot x \lfloor\!\lfloor y \approx a \cdot (x \parallel y)$
(A4)  $(x + y) \cdot z \approx x \cdot z + y \cdot z$   (M4)   $(x + y) \lfloor\!\lfloor z \approx x \lfloor\!\lfloor z + y \lfloor\!\lfloor z$
(A5)  $(x \cdot y) \cdot z \approx x \cdot (y \cdot z)$   (M5)   $(x \lfloor\!\lfloor y) \lfloor\!\lfloor z \approx x \lfloor\!\lfloor (y \parallel z)$
                                                  (M6$_\alpha$)  $x \cdot \alpha \lfloor\!\lfloor \alpha \approx (x \lfloor\!\lfloor \alpha) \cdot \alpha$

The axioms A1-A5 together with the axioms M1-M4 form the standard axiomatisation
of interleaving. Consider the single-sorted signature Σ with the
elements of A as constants and the binary operations +, ·, $\lfloor\!\lfloor$ and $\parallel$. In writing
terms we shall often omit the operation · for sequential composition; we
assume that sequential composition binds strongest and that the operation +
for alternative composition binds weakest.

Let $\mathcal{R}$ consist of the axioms A3-A5 and M1-M4 of Table 1 interpreted as
rewrite rules by orienting them from left to right. The term rewriting system
$(\Sigma, \mathcal{R})$ is ground terminating and ground confluent modulo associativity and
commutativity of + (cf. the axioms A1 and A2). A ground term t is a basic term
if there exist disjoint finite sets I and J, elements $a_i$ and $b_j$ of A and basic terms
$t_i$, for $i \in I$ and $j \in J$, such that

$$t \approx \sum_{i \in I} a_i t_i + \sum_{j \in J} b_j \quad \text{(by A1 and A2)}.$$

Every ground normal form of $(\Sigma, \mathcal{R})$ is a basic term.

It is well-known that the axioms A1-A5 together with M1-M4 do not constitute
an ω-complete axiomatisation; all ground substitution instances of M5
are derivable, while the axiom itself is not. Moller (1989) has shown that, in a
setting with prefix sequential composition instead of the binary operation ·, it
suffices to add M5 to obtain an ω-complete axiomatisation (see Groote (1990)
for an alternative proof). Clearly, neither $x\alpha \lfloor\!\lfloor \alpha$ nor $(x \lfloor\!\lfloor \alpha)\alpha$ is an instance of
any of the axioms A1-A5 and M1-M5, so M6$_\alpha$ is not derivable. However, each
ground substitution instance of M6$_\alpha$ is derivable.

Proposition 1. If α is a finite sum of elements of A, then, for every ground
term t,

$$\mathrm{A1}, \ldots, \mathrm{A5}, \mathrm{M1}, \ldots, \mathrm{M4} \vdash t\alpha \lfloor\!\lfloor \alpha \approx (t \lfloor\!\lfloor \alpha)\alpha.$$

Consequently, in the case of binary sequential composition, the axioms A1-A5
together with M1-M5 do not constitute an ω-complete axiomatisation. In the
sequel, we shall prove that $\mathrm{PA}_A$ is ω-complete.
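Ground instances of the new axiom scheme can be checked mechanically by computing basic-term normal forms. The sketch below is an illustration, not the authors' procedure: ground terms are encoded as nested tuples (a hypothetical encoding, with `'.'` for sequential composition, `'lm'` for left merge and `'par'` for merge), and a basic term is represented as a frozenset of pairs (action, continuation), so that the axioms A1-A3 hold by construction. The recursive clauses mirror A4, A5 and M1-M4 read from left to right.

```python
def seq(N, M):
    """Normal form of N . M (A4, A5 applied to the summands of N)."""
    return frozenset((a, M if rest is None else seq(rest, M)) for a, rest in N)


def lmerge(N, M):
    """Normal form of the left merge of N and M (M2, M3, M4)."""
    return frozenset((a, M if rest is None else merge(rest, M)) for a, rest in N)


def merge(N, M):
    """Normal form of N || M (M1)."""
    return lmerge(N, M) | lmerge(M, N)


def normal_form(t):
    """Canonical basic term of a ground term given as a string or (op, x, y)."""
    if isinstance(t, str):
        return frozenset({(t, None)})  # a single atomic action
    op, x, y = t
    X, Y = normal_form(x), normal_form(y)
    if op == '+':
        return X | Y
    if op == '.':
        return seq(X, Y)
    if op == 'lm':
        return lmerge(X, Y)
    if op == 'par':
        return merge(X, Y)
    raise ValueError(op)
```

Two ground terms receive the same normal form exactly when their finite behaviour trees coincide, so comparing normal forms is a convenient way to test individual ground instances; it says nothing about derivability of the open equation, which is the point of the section.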



3 Basic Facts 

In every A-merge algebra, + and $\parallel$ are commutative and associative; we shall
often implicitly make use of this. Also, we shall frequently abbreviate the statement
$\mathrm{PA}_A \vdash u \approx u + t$ by $t \preceq u$; if $t \preceq u$, then we call t a summand of u. Note
that $\preceq$ is a partial order on the set of terms modulo $\approx$; in particular, if $t \preceq u$
and $u \preceq t$, then $t \approx u$.

Lemma 2. Let a be an element of A and let t, u and v be ground terms. If
$at \preceq u + v$, then $at \preceq u$ or $at \preceq v$.

Suppose t is a ground normal form of the system $(\Sigma, \mathcal{R})$ and suppose that

$$t \approx \sum_{i \in I} a_i t_i + \sum_{j \in J} b_j,$$

then the degree d(t) of t is defined by $d(t) = |I| + |J|$. We let the degree of an
arbitrary ground term be the degree of its unique normal form in $(\Sigma, \mathcal{R})$. By
$d_{\max}(t)$ we shall denote the maximal degree that occurs in t, i.e.,

$$d_{\max}(t) = \max(\{d(t)\} \cup \{d_{\max}(t') \mid \text{there exists an } a \in A \text{ such that } at' \preceq t\}).$$



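Lemma 3. If α is a finite sum of elements of A, then

$$\mathrm{PA}_A \vdash x\alpha \lfloor\!\lfloor \alpha^n \approx (x \lfloor\!\lfloor \alpha^n)\alpha, \quad \text{and} \quad \mathrm{PA}_A \vdash x\alpha \parallel \alpha^n \approx (x \parallel \alpha^n)\alpha.$$

Proof. It is straightforward to show by induction on n that the identity (*)
$\alpha^{n+1} \approx \alpha^n \parallel \alpha$ is derivable from $\mathrm{PA}_A$; we shall use it in the proof of the first set
of equations (**) $x\alpha \lfloor\!\lfloor \alpha^n \approx (x \lfloor\!\lfloor \alpha^n)\alpha$, which is by induction on n. If n = 1,
then (**) is an instance of M6$_\alpha$, and for the induction step we have the following
derivation:

$$\begin{array}{rcll}
x\alpha \lfloor\!\lfloor \alpha^{n+1} & \approx & x\alpha \lfloor\!\lfloor (\alpha^n \parallel \alpha) & \text{(by *)} \\
 & \approx & (x\alpha \lfloor\!\lfloor \alpha^n) \lfloor\!\lfloor \alpha & \text{(by M5)} \\
 & \approx & (x \lfloor\!\lfloor \alpha^n)\alpha \lfloor\!\lfloor \alpha & \text{(by IH)} \\
 & \approx & ((x \lfloor\!\lfloor \alpha^n) \lfloor\!\lfloor \alpha)\alpha & \text{(by M6}_\alpha\text{)} \\
 & \approx & (x \lfloor\!\lfloor (\alpha^n \parallel \alpha))\alpha & \text{(by M5)} \\
 & \approx & (x \lfloor\!\lfloor \alpha^{n+1})\alpha & \text{(by *).}
\end{array}$$

The second set of equations is also derived by induction on n, using (**). □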

Milner and Moller proved that if t, u and v are ground terms such
that $t \parallel v$ and $u \parallel v$ are bisimilar, then t and u are bisimilar (a similar result
was obtained earlier by Castellani and Hennessy (1989) in the context of
distributed bisimulation). Also, they proved that every finite process has, up
to bisimulation equivalence, a unique decomposition into prime components.
Since $\mathrm{PA}_A$ is a sound and complete axiomatisation for bisimulation equivalence
(Bergstra and Klop, 1984), the following two results are consequences of theirs.

Lemma 4. If t, u and v are ground terms such that $\mathrm{PA}_A \vdash t \parallel v \approx u \parallel v$, then
$\mathrm{PA}_A \vdash t \approx u$.

Definition 5. A ground term t we shall call parallel prime if there do not exist
ground terms u and v such that $\mathrm{PA}_A \vdash t \approx u \parallel v$.

Theorem 6 (Unique factorisation). Any ground term can be expressed uniquely
as a parallel composition of parallel prime components.

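We associate to each term t a norm $\lfloor t \rfloor$ and a depth $\lceil t \rceil$ as follows:

$\lfloor x \rfloor = \lfloor a \rfloor = 1$                                  $\lceil a \rceil = \lceil x \rceil = 1$                                  ($a \in A$ and x a variable)
$\lfloor x * y \rfloor = \lfloor x \rfloor + \lfloor y \rfloor$              $\lceil x * y \rceil = \lceil x \rceil + \lceil y \rceil$              if $* \in \{\cdot, \lfloor\!\lfloor, \parallel\}$; and
$\lfloor x + y \rfloor = \min\{\lfloor x \rfloor, \lfloor y \rfloor\}$       $\lceil x + y \rceil = \max\{\lceil x \rceil, \lceil y \rceil\}$.

Notice that if $t \approx u$, then t and u must have equal norm and depth.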
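Both measures are simple structural recursions. The sketch below uses a hypothetical tuple encoding of terms (strings for atomic actions and variables, `('+', x, y)` for alternative composition, and any other operator tag for sequential composition, left merge or merge); it is meant only to make the definition concrete.

```python
def norm(t):
    """Length of a shortest complete run: min over '+', sum otherwise."""
    if isinstance(t, str):
        return 1
    op, x, y = t
    if op == '+':
        return min(norm(x), norm(y))
    return norm(x) + norm(y)


def depth(t):
    """Length of a longest run: max over '+', sum otherwise."""
    if isinstance(t, str):
        return 1
    op, x, y = t
    if op == '+':
        return max(depth(x), depth(y))
    return depth(x) + depth(y)
```

For example, a merge of two atomic actions and its expansion into a sum of two sequential compositions both have norm 2 and depth 2, in line with the remark that provably equal terms agree on both measures.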

Lemma 7. If t, t', u and u' are ground terms such that $\lfloor t \rfloor = \lfloor t' \rfloor$, $\lfloor u \rfloor = \lfloor u' \rfloor$
and $\mathrm{PA}_A \vdash tu \approx t'u'$, then $\mathrm{PA}_A \vdash t \approx t'$ and $\mathrm{PA}_A \vdash u \approx u'$.

Definition 8. Let t and t' be ground terms; we shall write $t \to t'$ if there exists
$a \in A$ such that $at' \preceq t$ and $\lfloor t' \rfloor < \lfloor t \rfloor$. We define the set red(t) of reducts of t
as the least set that contains t and is closed under $\to$; if $t \to t'$, then we call
t' an immediate reduct of t.

Lemma 9. Let t be a ground term. If $t \to t'$ and $t \to t''$ implies that
$\mathrm{PA}_A \vdash t' \approx t''$ for all ground terms t' and t'', then there exists a parallel prime ground
term $t^*$ such that $\mathrm{PA}_A \vdash t \approx t^* \parallel \cdots \parallel t^*$.

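Proof. First, suppose that u and v are parallel prime, and let u' and v' be such
that $u \to u'$ and $v \to v'$; then $u \parallel v \to u' \parallel v$ and $u \parallel v \to u \parallel v'$. So, if
$u' \parallel v \approx u \parallel v'$, then since $\lfloor u' \rfloor < \lfloor u \rfloor$, u cannot be a component of the prime
decomposition of u'; hence, by Theorem 6, $u \approx v$.

Suppose $t \approx t_1 \parallel \cdots \parallel t_n$ with $t_i$ parallel prime for all $1 \le i \le n$ and

$$\lfloor t_1 \rfloor \le \cdots \le \lfloor t_n \rfloor.$$

If $\lfloor t_1 \rfloor = 1$, then $\lfloor t_i \rfloor = 1$ for all $1 \le i \le n$; for suppose that $t_i'$ is a ground term
such that $t_i \to t_i'$, then from
$t_2 \parallel \cdots \parallel t_n \approx t_1 \parallel \cdots \parallel t_{i-1} \parallel t_i' \parallel t_{i+1} \parallel \cdots \parallel t_n$
we get by Lemma 4 that $t_i \approx t_1 \parallel t_i'$, but $t_i$ is parallel prime. From
$t_1 \parallel \cdots \parallel t_{i-1} \parallel t_{i+1} \parallel \cdots \parallel t_n \approx t_1 \parallel \cdots \parallel t_{j-1} \parallel t_{j+1} \parallel \cdots \parallel t_n$
we conclude by Lemma 4 that $t_i \approx t_j$.

The remaining case is that $\lfloor t_i \rfloor > 1$ for all $1 \le i \le n$. Let $t_i'$ and $t_j'$ be
ground terms such that $t_i \to t_i'$ and $t_j \to t_j'$ for some $1 \le i < j \le n$; then by
Lemma 4, $t_i' \parallel t_j \approx t_i \parallel t_j'$. Since $\lfloor t_i' \rfloor < \lfloor t_i \rfloor$, $t_i$ cannot be a component of the
prime decomposition of $t_i'$, so by Theorem 6, $t_i \approx t_j$. □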

4 Mixed Equations 

We shall collect some results about mixed equations; these are equations of the
form $tu \approx v \parallel w$.

Lemma 10. If $t$, $u$ and $v$ are ground terms such that $\mathrm{PA}_A \vdash tu \approx u \parallel v$, then
there exists a finite sum $\alpha$ of elements of $A$ such that $\mathrm{PA}_A \vdash u \approx \alpha^k$ for some
$k \geq 1$.

Proof. Note that $\lceil t \rceil = \lceil v \rceil$; we shall first prove the following

Claim: if $\lceil t \rceil, \lceil v \rceil = 1$, then there exists a $k \geq 1$ such that $u \approx t^k$ and $t \approx v$.

Let t = oi + . . . + am with oi , . . . , am G A] we proceed by induction on [itj . 
If [itj = 1, then there exists a £ A such that a ^ u, whence at ^ u\\v. Since 
oiuH — • + amU ~ M II u, there exists by Lemma|2|an i such that atu ~ av; hence 
by Lemma [7 |m ~ v. Since |"u] = 1 it follows that tu ~ u 1 1 u ~ uu ~ vu, hence by 
Lemma 0 1 ~ v. 

If [itj > I, then there exist b\, . ■ ■ ,bn & A and ground terms ui, . ■ ■ ,Un such 
that u ~ 6 iUi + . . . + bnUn- Then aiu+. . . + amU Ri tu Ri u||u ~ bi{ui ||w) + . . .+ 

bn(un II + vu, so by Lemma0rti || u ~ u, for all I < i < n. By Lemma0there 

exists u' such Ui Ri u' for all 1 < i < n, and, by A4, {bi + . . . + bn)u' Ri u r: u' 1 1 u. 
Hence by the induction hypothesis u r; 6 i + . . . + 6 „ and u' R for some fc > 1. 
So rt R , and from 

tu R UU + b\{u' II u) + • • • + bn{u' II v) 

R nu + nu (by A4) 

R vu (by A3) 

it follows, by Lemma 0 that t r u. This completes the proof of our claim. 

The proof of the lemma is by induction on $\lceil v \rceil$. If $\lceil t \rceil, \lceil v \rceil = 1$ then $t$ is a finite
sum of elements of $A$ and by our claim $u \approx t^k$ for some $k \geq 1$. If $\lceil t \rceil, \lceil v \rceil > 1$, then
there exist $a \in A$ and ground terms $t'$ and $v'$ such that $av' \preccurlyeq v$ and $t'u \approx u \parallel v'$;
hence, by the induction hypothesis, there exists a finite sum $\alpha$ of elements of $A$
such that $u \approx \alpha^k$, for some $k \geq 1$. □

Lemma 10 has the following consequence.

Lemma 11. If $t$, $t'$, $u$ and $v$ are ground terms such that $\mathrm{PA}_A \vdash tu \approx t'u \parallel v$, then
there exists a finite sum $\alpha$ of elements of $A$ such that $\mathrm{PA}_A \vdash u \approx \alpha^k$ for some
$k \geq 1$.

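Lemma 12. Let $\alpha$ be a finite sum of elements of $A$; if $t$, $u$ and $v$ are ground
terms such that $\mathrm{PA}_A \vdash t\alpha^k \approx u \parallel v$ for some $k \geq 1$, then $\mathrm{PA}_A \vdash u \approx \alpha^l$ for some
$l \leq k$, or there exists a ground term $t'$ such that $\mathrm{PA}_A \vdash u \approx t'\alpha^k$.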

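Proof. The proof is by induction on the norm of $v$.

If $\lfloor v \rfloor = 1$, then there exists an $a \in A$ such that $a \preccurlyeq v$, whence $au \preccurlyeq t\alpha^k$. If
$a \preccurlyeq t$, then $u \approx \alpha^k$, and if there exists a ground term $t'$ such that $at' \preccurlyeq t$, then
$u \approx t'\alpha^k$.

Suppose that $\lfloor v \rfloor > 1$ and let $v'$ be a ground term such that $\lfloor v' \rfloor < \lfloor v \rfloor$ and
$av' \preccurlyeq v$, whence $a(u \parallel v') \preccurlyeq t\alpha^k$. If $a \preccurlyeq t$, then $u \parallel v' \approx \alpha^k$, hence there exists an
$l < k$ such that $u \approx \alpha^l$. Otherwise, suppose that $t^*$ is a ground term such that
$at^* \preccurlyeq t$ and $u \parallel v' \approx t^*\alpha^k$; by the induction hypothesis $u \approx \alpha^l$ for some $l \leq k$, or
there exists a ground term $t'$ such that $u \approx t'\alpha^k$. □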

iHirshfeld and .Terrum (199jjl give a thorough investigation of a particular 
kind of mixed equations; we shall adapt some of their theory to our setting. 

Let $\alpha$ be a finite sum of elements of $A$. A ground term $t$ we shall call $\alpha$-free
if $t \not\approx \alpha$ and there exists no ground term $t'$ such that $t \approx t' \parallel \alpha$. We shall call a
ground term $t$ an $\alpha$-term if $t \approx \alpha^k$ for some $k \geq 1$. The $\alpha$-norm $\lfloor t \rfloor_\alpha$ of a ground
term $t$ is the length of the shortest reduction of $t$ to an $\alpha$-term, or the norm of
$t$ if such a reduction does not exist. Note that if $t \approx u$, then $\lfloor t \rfloor_\alpha = \lfloor u \rfloor_\alpha$; the
$\alpha$-norm of an equation is the $\alpha$-norm of both sides. We shall write $t \to_\alpha t'$ if
$t \to t'$ and $\lfloor t' \rfloor_\alpha < \lfloor t \rfloor_\alpha$; if $\lfloor u \rfloor_\alpha = 1$, then we say that $u$ is an $\alpha$-unit. In line
with Definition 8, a ground term $t'$ is an $\alpha$-reduct of a ground term $t$ if $t'$ is
reachable from $t$ by an $\alpha$-reduction.

It is easy to see that $\lfloor t \parallel u \rfloor_\alpha = \lfloor t \rfloor_\alpha + \lfloor u \rfloor_\alpha$, so we have the following lemma.



Lemma 13. Let $\alpha$ be a finite sum of elements of $A$; any $\alpha$-free $\alpha$-unit is parallel
prime.

Lemma 14. If $t$ is $\alpha$-free, then $t\alpha$ is $\alpha$-free.

Hirshfeld and Jerrum proved a variant of Lemma 9.

Lemma 15. Let $\alpha$ be a finite sum of elements of $A$, and let $t$ be an $\alpha$-free
ground term. If $t \to_\alpha t'$ and $t \to_\alpha t''$ implies that $\mathrm{PA}_A \vdash t' \approx t''$ for all ground
terms $t'$ and $t''$, then there exists a parallel prime ground term $t^*$ such that
$\mathrm{PA}_A \vdash t \approx t^* \parallel \cdots \parallel t^*$.

A pumpable equation is a mixed equation of the form

$$(t_1 \parallel \cdots \parallel t_m)\alpha^k \approx u_1\alpha^k \parallel \cdots \parallel u_n\alpha^k,$$

where $\alpha$ is a finite sum of elements of $A$, $k \geq 1$, $m, n \geq 2$ and $t_i$ and $u_j$ are
$\alpha$-free ground terms for $1 \leq i \leq m$ and $1 \leq j \leq n$. The following lemma occurs
in Hirshfeld and Jerrum (1998) as Lemma 7.2.

Lemma 16. There are no pumpable equations with $\alpha$-norm less than three.

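Proposition 17. Let $t$, $u$, $u'$ and $v$ be ground terms such that $t$ and $v$ are $\alpha$-free
and

$$\mathrm{PA}_A \vdash (t \parallel u)\alpha^k \approx v\alpha^k \parallel u'\alpha^k. \qquad (1)$$

If $u$ and $u'$ are $\alpha$-units, then $\mathrm{PA}_A \vdash u \approx u'$.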

Proof. If there exists a ground term u* such that m r m* 1 1 a, then by Lemma 0 
va^ II u'a^ R {t II u* II a)a^ R {t || u*)a^ || a; by Lemma IPil va'^ is a-free, 
hence there exists a ground term u** such that u' R u** \ \ a. Vice versa, from 
u' R u** II a we obtain the existence of a u* such that m r m* || a. In both cases 
(t II u*)a^ II ~ va^ || u**a^ || a, whence 

Hence, we may assume without loss of generality that the $\alpha$-units $u$ and $u'$ are
$\alpha$-free, so that (1) is a pumpable equation. By Lemma 16 there are no pumpable
equations with $\alpha$-norm less than three, so $\lfloor t \rfloor_\alpha, \lfloor v \rfloor_\alpha \geq 2$; we prove the lemma
by induction on $\lfloor t \rfloor_\alpha$.

If there exist ground terms t' and v' such that t — >a v — >a v' and (t' || 

u)a^ Ri v'a^ II u' ^ then we may conclude u~u' from the induction hypothesis. 
Since the a-units u'a^ and u have unique immediate a-reducts, in the case 
that remains, t and v have unique immediate a-reducts t' and v' , respectively; 
hence, by Lemma El there exists a parallel prime ground term v* such that 
Ri u* II • • • II u*. By Lemma 0 

{t' II u)a^ K, va^ II ~ (u || a^~'’*)a^, for some i > 0, 

so t' II M Ri II • • • II II a^+L Since u is a-free, whence parallel prime by 
Lemma Cl it follows that u~v*\ hence 

{t\\u)a’^ « (u II • • • II w)a'= II u'a’^ and R^ u 1 1 • • • || u || a'^+L 

Clearly, there exists a, j > k such that w'a^ || is an a-reduct of va^ || u'a^ 
(a-reduce va^ to a^). Hence, by Lemma0 {u' \ \ a^)a^ is an a-reduct of (t 1 1 u)a^, 
so m' 1 1 is an a-reduct of 1 1 1 u. If u' 1 1 a-^ is obtained by reducing t to an a-term, 
then UK. u' follows, since u and u' are a-free. Otherwise, there exists j' < j such 
that rt' II a-^ is an a-reduct of t, hence of the unique immediate a-reduct t' of 
t. Every a-reduct of t' with a-norm 1 is of the form it 1 1 a-^ . Since u and u' are 
parallel prime, u k u' follows. □ 

5 Mixed Terms 

We shall now define the set of head normal forms, thus restricting the set of terms
that we need to consider in our proof that $\mathrm{PA}_A$ is $\omega$-complete. The syntactic form
of head normal forms motivates our investigation of a particular kind of terms that
we shall call mixed terms (nestings of parallel and sequential compositions). We
shall work towards a theorem that certain instantiations of mixed terms are
either parallel prime or a parallel composition of a parallel prime and $\alpha^k$ for
some finite sum $\alpha$ of elements of $A$.

Let $x$ be a variable, suppose $t = x$ or $t = xt'$ for some term $t'$, and suppose
$u = u_1, \ldots, u_j$ and $v = v_1, \ldots, v_j$ are sequences of terms; we define the set of
$x$-prefixes $L_j[t, u, v]$ inductively as follows:

$L_0[t] = t$; and

$L_{j+1}[t, u, u_{j+1}, v, v_{j+1}] = (L_j[t, u, v] \mathbin{\lfloor\!\lfloor} u_{j+1})v_{j+1}$.

A term $t$ is a head normal form if there exist finite sets $I$, $J$, $K$ and $L$ such
that

$$t \approx \sum_{i \in I} a_i t_i + \sum_{j \in J} b_j + \sum_{k \in K} v_k \mathbin{\lfloor\!\lfloor} u_k + \sum_{l \in L} w_l \qquad \text{(by A1 and A2)}$$

with the $a_i$ and $b_j$ elements of $A$, the $t_i$ and $u_k$ arbitrary terms and each $v_k$ and
$w_l$ an $x$-prefix for some variable $x$.

Lemma 18. For each term $t$ there exists a head normal form $t^*$ such that $\mathrm{PA}_A \vdash
t \approx t^*$.

We shall associate with every equation $t \approx u$ a substitution $\sigma$ such that
$t^\sigma \approx u^\sigma$ implies that $t \approx u$. The main idea of our $\omega$-completeness proof is to
substitute for every variable in $t$ or $u$ a ground term that has a subterm $\varphi_n$ of
degree $n$, where, intuitively, $n$ is large compared to the degrees already occurring
in $t$ and $u$. Let $a$ be an element of $A$ and let $n \geq 1$; we define

$$\varphi_n = a^n + a^{n-1} + \cdots + a.$$
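
To make the shape of these terms concrete, the following sketch builds the term defined above, read here as the sum of the powers $a^n, a^{n-1}, \ldots, a$ (this reading of the extracted formula is an assumption). The tuple encoding and the helper names `power`, `phi`, `summands` and `length` are hypothetical and only serve the illustration.

```python
from functools import reduce


def power(a, k):
    """a^k: k copies of the action a composed sequentially (k >= 1)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    term = ("act", a)
    for _ in range(k - 1):
        term = ("seq", ("act", a), term)
    return term


def phi(a, n):
    """phi_n, read as a^n + a^(n-1) + ... + a (assumed reading of the definition)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    powers = [power(a, k) for k in range(n, 0, -1)]
    return reduce(lambda left, right: ("alt", left, right), powers)


def summands(t):
    """Flatten a sum into the list of its summands, left to right."""
    if t[0] == "alt":
        return summands(t[1]) + summands(t[2])
    return [t]


def length(t):
    """Number of actions in a purely sequential term."""
    return 1 if t[0] == "act" else length(t[1]) + length(t[2])


lengths = [length(s) for s in summands(phi("a", 4))]
print(lengths)  # prints: [4, 3, 2, 1]
```

Under this reading the term has $n$ summands of pairwise different length, which is the property the substitution argument exploits when $n$ is chosen large.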



Lemma 19. If $n \geq 2$ and $t$ is a ground term, then $\varphi_n t$ is parallel prime.

Suppose $t$ is a term, and let $u = u_1, \ldots, u_j$ and $v = v_1, \ldots, v_j$ be sequences
of terms; we define the set of mixed terms $M_j[t, u, v]$ inductively as follows:

$M_0[t] = t$; and

$M_{j+1}[t, u, u_{j+1}, v, v_{j+1}] = (M_j[t, u, v] \parallel u_{j+1})v_{j+1}$.
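
The inductive definition is a left fold over the two sequences, as the sketch below illustrates. It assumes the definition is read as: the mixed term of level 0 is $t$ itself, and each further level puts the previous mixed term in parallel with the next $u$ and then composes sequentially with the next $v$. Terms are represented as plain strings, and the function name `mixed_term` and its `op` parameter are hypothetical.

```python
def mixed_term(t, us, vs, op="||"):
    """Build the mixed term M_j[t, u, v] as a fully parenthesised string.

    us and vs are the sequences u_1..u_j and v_1..v_j; op is the parallel
    operator symbol to print (an illustrative parameter, not from the paper).
    """
    if len(us) != len(vs):
        raise ValueError("the sequences u and v must have the same length j")
    term = t
    for u, v in zip(us, vs):
        # One more level: (previous || u_next) followed sequentially by v_next.
        term = f"(({term} {op} {u}) . {v})"
    return term


print(mixed_term("t", [], []))                      # prints: t
print(mixed_term("t", ["u1", "u2"], ["v1", "v2"]))  # prints: ((((t || u1) . v1) || u2) . v2)
```

The nesting makes visible why these terms are described as nestings of parallel and sequential compositions: every new level wraps the whole previous term.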

Let $t$ be a ground term; we denote by $d_{\max}(t)$ the least upper bound for the
degrees of all the reducts of $t$, i.e.,

$$d_{\max}(t) = \max\{d(t') \mid t' \in red(t)\}.$$



Definition 20. A mixed term Mj[ipnt,u,v] we shall call a generalised (p„-term 

if 

< n. 

Note that there are no generalised $\varphi_1$-terms.

Lemma 21. Let Mj[(pnt,u,v] be a generalised cpn-term and let u be a ground 
term such that 



PA A b Mj [pnt, u, w] « (fntvi ■ ■ ■ Vj \ \ U. 

Then there exists a finite sum a of elements of A such that PA^i \~ vi - ■ -vj « 
and PAyi h u « ui 1 1 • • • 1 1 for some k,l > 1. 

Proof. From Mj[(pnt,u,v] — >■ Mj[t,u,v] and d{Mj[t,u,v]) < n it follows that 
Mj [t, u, u] « tui • • • Vj 1 1 u; hence 

'^max(t^i---^j l|w) <«• (2) 

Note that $\lfloor u \rfloor = \lfloor u_1 \parallel \cdots \parallel u_j \rfloor$; we shall prove the lemma by induction on $\lfloor u \rfloor$.

If [uj = 1, then j = 1 and [uij = 1. By Lemma ITTI there exists a finite sum 
a of elements of A such that vi ~ for some fc > 1, and hence by Lemma \TI\ 
UK, a. By Lemma0 {‘frA || u\)a^ k ipnta^ || a ~ {(fint || a)a^, so ui k a. 

If [itj > 1, then there are three cases: [uj\ = 1 and j > 1, [uj\ > 1 and 
j = I, and [ujJ > I and j > I. We shall only treat the last case; for the other 
two cases the proof is similar. 

Let u'j be an immediate reduct of Uj] by 0 there exists an immediate reduct 
u' of u such that 



Mj Ui,..., Uj^i, u' , v] K iPntVi ■■■Vj \\u'. 

Since Mj [ipnt, ui , . . . , Uj-i, u' , n] , there exists by the induction hypothesis a finite 
sum a of elements of A such that vi ■ ■ ■ vj k and ui 1 1 • • • 1 1 uj-i 1 1 rt' ~ for 
some k,l > 1. By means of j — 1 applications of LemmaEl we find that 

Mj[(fint,u,v] K || II a* ~ ^Pnta^ || u. 

Since ipntct^ is parallel prime by Lemma [TUI there exists a ground term u* with 
[u*J = \uj\ and [m*J„ = \uj\a such that u k u* \\ and || 

Uj)a^'"^^ K ipntcA II u*. If [u*J < [ujj , then by Lemma El u* is an a-term, 
which implies that u and Uj are also a-terms. So suppose that [it*J > [yj\, 
and let be a ground term such that u* k . Since [uj\ > and 

\uj\a = by Proposition ^3 Uj and are not a-units. Since it' is an a- 

term and uj — > it' , Uj must be an a-term, so also it^ is an a-term. Consequently, 
It ~ It* II a* ~ itlai’'^J II a* is an a-term. □ 



Lemma 22. Let Mj[ipnt,u,v] be a generalised (pn-term. If Mj[ipnt,u,v] is not 
a-free, then there exists i < j such that PAq \~Vi - ■ -Vj k and Ui is not a-free. 

Proof. Let t* be a ground term such that 



Mj[p>nt,u,v]Kt* \\a. (3) 

Since <Pntvi ■ ■ -vj is a reduct of Mj Ypnt, u, il] and by Lemma El Pntvi ■ ■ -vj is 
parallel prime, ipntvi ■ ■ -vj must be a reduct of t*. Then (pntvi ■ ■ -Vj || a is a 
reduct of Mj[(pnt,u,v], so there exist sequences of ground terms u' and v' such 
that for some 1 < j' < j and 1 < i < j, 

Mj! [ipntvi ■ ■ ■ Vi-i, u', v'] K (fintvi ‘ ■ Vj 1 1 a, where v[ - ■ - Vj, k Vi - ■ ■ Vj . 

By Ijemma, 12 1 I t;. ■ ■ ■ Vj k a^, so in particular Vj k a* for some I < k. Clearly, 
[t*J > I, so by Lemma^3 there exists A such that t* k AaK We apply Lemma El 
to the right-hand side of O and cancel the a^-tail on both sides to obtain 

Mj_l[(fint,Ui, . . . ,ltj_i,l’i, . . . ,1'j-l] II Uj K t^ II a. 
The remainder of the proof is by induction on j . If j = 1 , then 1 1 1 u ^ Ri t^\a
implies that Uj is not a-free and we are done. If j > I and Uj is a-free, then 
Ml, . . . , Mj_i, Ml, . . . , Vj-i] is not a-free, so by the induction hypothesis 
there exists some 1 < i' < j — 1 such that Ui' is not a-free and Vi' ■ ■ ■ Vj-\ ~ a^ , 
whence Vi> ■ ■ ■ Vj rn . □ 

Proposition 23. If a generalised $\varphi_n$-term $t^*$ is not parallel prime, then there
exists a finite sum $\alpha$ of elements of $A$ and a parallel prime generalised $\varphi_n$-term
$t^\dagger$ such that $t^* \approx t^\dagger \parallel \alpha^k$ for some $k \geq 1$.

Proof. Let t* Ri Mj[ipnt,u,v] and let t\,. to be parallel prime ground terms 
such that 



Mj[ipnt,U,v] 



to- 



Since (pntvi ■ ■ - vj is a, reduct of Alj m, m] and parallel prime, (pntvi ■ ■ -Vj must 
be a reduct of some ti {1 < i < o); assume without loss of generality that it is a 
reduct of ti. 

Suppose that Mj [(pnt, u, m] is not parallel prime and let m Ri t 2 1 1 ■ • • | Uo- Since 
ipntvi • • • Mj 1 1 M is a reduct of Mj [pnt, m, m], there exist sequences of ground terms 
u' and v' such that, for some 1 < / < j and I <i < j, 

Mj' [ipntvi ■ ■ ■ Vi-i,u' , m'I Ri ifntvi ‘ ‘ ■ Vj 1 1 M, where v[ - ■ -v'j, PS Vi - ■ ■ Vj. 

So by Lemma EU there exists a finite sum a of elements of A such that u~a^, 
for some fc > 1. 

It remains to prove that if Mj[(^„t,M,M] Ri A || a, then A is a generalised 
</5„-term, for then it follows that ti is a generalised (/9„-term by induction on k. 
Since Mj[(pnt,u,v] is not a-free, there exists by Lemma l^ an i < j such that Ui 
is not a-free and Vi - - -Vj ~ a* for some / > 1. So either Ui ~ a or there exists m' 
such that Mi Ri m' 1 1 a; we only consider the second possibility, as the other can 
be dealt with similarly. By Lemma 01 we obtain 

Mj[iprit,U,v\ ~ Mj^i[LPnt,Ui, . . . . . . ,Uj,v] || a 

Hence t^ ~ Mj_i[(/3„t, mi, . . . , ut-i, m', m^+i, . . . , Uj,v] is a generalised (pn-terra.. 

□ 



6 ω-Completeness

Let $A$ be a nonempty set; we shall now prove that $\mathrm{PA}_A$ is $\omega$-complete. We
shall assume that the variables used in an equation $t \approx u$ are enumerated by
$x_1, x_2, \ldots, x_k, \ldots$ Let $x_i$ be a variable and let $m$ be a natural number; the
particular kind of substitutions $\sigma_m$ that we shall use in our proof satisfy

$$\sigma_m(x_i) = a(a\varphi_{i+m} + a)a.$$

We want to choose $m$ large compared to the degrees already occurring in $t$
and $u$; with every term $t$ we associate a natural number $d^{\sigma}_{\max}(t)$ that denotes
the maximal degree that occurs in $t$ after applying a substitution of the form
described above, treating the terms $\varphi_{i+m}$ as fresh constants.

Definition 24. Suppose S = {^ 1 ,^ 2 , ■ ■ ■ ■ ■ ■} is a countably infinite set of 

constant symbols such that SDA = 0. Let t be a term and let a be a substitution 
such that 



a{xi) = a{a^i + a)a. 

We define ide maximal degree that occurs in , i.e., df^^^^it) = 

dmaxit'")- 



Lemma 25. If t is a term and let m be a natural number, then 
< dfr^axit); ^.nd 

ii. if a € A and t' is a ground term such that at' ^ then d(t') < 

If Lj[t, u, ?;] is an a;i-prefix and m is a, natural number, then 

Lj[t,U,v]'^"' )■ Mj[{a(fii+rn + a)t",U,v]'^"' ^ aMj[<p,,+rnt'' , (4) 

where t" = a ii t = Xi and t" = at' ii t = Xit' for some term t' . If m > 
d'f^,^y,{Lj\t,u,v\), then Mj[ipi+mt'' ,u,vY'^ is a generalised (/?„-term; we shall call 
it the generalised (fi+m-term associated with Lj[t,u,v] by am- 

For ground terms t and t*, let us write t 1 — > t* if there exists a ground term 
t' and an a G A such that t — > t' ^ at* . 

Lemma 26. Let t be an x-prefix and suppose that m > dmax(^)- If n > m and 
t* and A are generalised ip^-terms such that 1 — >■ t* and f*™ 1 — >■ Y , then 

PAA^t* PS ft. 

Proof. Note that the unique immediate reduct of is of the form Mj[{a(pi+m + 

a)t',u,vY”'. Moreover, m > dj^ax(^)> so Mj[(pi+mt' ,u,vY”' is the unique ground 
term t* such that at* =4 Mj[{aipiAm + a)t',u,vY"‘ and d(t*) > m. Hence, if t* 
is any generalised (/^n-term with n > dmax(^) such that !->■ t*, then t* ~ 
AIj[l^i+mt , U, v] 

Note that if t is an cc-prefix, m > dmax(^) i* is the generalised ipiAm-ierm 
associated with t by am, then t* has no reduct with a degree in {m + l,m + 
2, . . . ,m + i — 1}. Lemma has the following consequence. 

Lemma 27. Let t be an x-prefix, let u be a y-prefix and let m > max{<i^aa;(i)) 
^max(^)}- I.f PAyl h t'*"' PS u'*”' , then X = y. 

We generalise the definition of $\alpha$-freeness to terms with variables: a term $t$ is
$\alpha$-free if $t \not\approx \alpha$ and there exists no term $t'$ such that $t \approx t' \parallel \alpha$.

Theorem 28 ($\omega$-completeness). Let $A$ be a nonempty set; then $\mathrm{PA}_A$ is
$\omega$-complete, i.e., for all terms $t$ and $u$,

if $\mathrm{PA}_A \vdash t^\sigma \approx u^\sigma$ for all ground substitutions $\sigma$, then $\mathrm{PA}_A \vdash t \approx u$.

Proof. Let $m > \max\{d^{\sigma}_{\max}(t), d^{\sigma}_{\max}(u)\}$. We shall prove by induction on the
depth of $t$ that if $t^{\sigma_m} \approx u^{\sigma_m}$ then $t \approx u$; clearly, this implies the theorem.

In the full version of this paper we simultaneously prove that one may assume
without loss of generality that the generalised $\varphi_n$-term associated by $\sigma_m$ with
an $x$-prefix is parallel prime; we need this assumption in the remainder of the
proof.

Suppose that $t^{\sigma_m} \approx u^{\sigma_m}$; it suffices to show that every summand of $u$ is a
summand of $t$, for then by a symmetric argument it follows that every summand
of $t$ is a summand of $u$, whence $t \approx u$. There are four cases:

1. If $b \in A$ such that $b \preccurlyeq u$, then $b \preccurlyeq t$ since $\sigma_m$-instances of summands of one
of the three other types have a norm $> 1$.

2. li a G A and u' is a term such that au' =4 u, then by Lemma ESI d(u') < 

dm^yfu). Since m > dmg^{u), cannot be an immediate reduct of a 

(Tm-instance of an x-prefix or of a term x w, with v an x-prefix. So there 
exists i G I such that a{u'Y^ ~ aitf”'. By the induction hypothesis u' k, ti, 
hence au' =4 t. 

3. Let V be an x-prefix and let u' is a term such that v \\_u' ^ u. By our 
assumption that generalised (^„-terms associated by am with x-prefixes are 
parallel prime, there exists k G K such that ~ vfY and {u'Y^ ~ 

by Lemma E3 Vk is also an x-prefix. Hence, by the induction hypothesis 
V PS Vk and u' ps Uk, so v \ \_u ^ t. 

4. If w is an x-prefix such that w u, then by our assumption that generalised 

V^n-terms associated by am with x-prefixes are parallel prime, there exists 
I G L such that w®’'" ~ by Lemma E7| wi is also an x-prefix. If the 

generalised :/5„-term associated to w by am is of the form ipnw', then clearly 
the generalised (pnAerm. associated to wi by am must be of the form ipnw'i 
and it is immediate by the induction hypothesis that w' ~ w'l and w wi. 
Let w = {f W_ u')v' and let wi = {t" [|_ u")v", where t' and t" are x-prefixes 
to which am associates parallel prime generalised :/3„-terms Y and Y. If 

, then by the induction hypothesis {Y |J_ u') fn {t" |J_ u") 
and v' ~ v" , whence w m wi. So let us assume without loss of generality that 
< \{v"Y”"^', then there is a ground term v* such that {Y [[ u'Y"' ~ 
{t" U_ ^'Y^'v* v*{v’Y"' ~ {Y'Y”'- Note that {Y || u"Y"^v* « {Y || u’Y”', 

which is not parallel prime. So there exists by Proposition 12,41 a finite sum 
a of elements of A and a parallel prime generalised (pn-term t* such that 
{Y II {u"Y”'Y* ~ t* II . Hence by Lemma E3 ~ cY for some I > 1. 
Consequently, v" ~ Yv' and by the induction hypothesis Y\\_u' ~ {t"W_u")Y; 
hence, w ^ t. □ 






References 

Aceto, L. and Hennessy, M. (1993). Towards action-refinement in process
algebras. Information and Computation, 103(2), 204-269.

Baeten, J. C. M. and Weijland, W. P. (1990). Process Algebra. Number 18
in Cambridge Tracts in Theoretical Computer Science. Cambridge University
Press.

Bergstra, J. A. and Klop, J. W. (1984). Process algebra for synchronous
communication. Information and Control, 60(1-3), 109-137.

Castellani, I. and Hennessy, M. (1989). Distributed bisimulations. Journal of
the ACM, 36(4), 887-911.

Groote, J. F. (1990). A new strategy for proving ω-completeness applied to
process algebra. In J. Baeten and J. Klop, editors, Proceedings of CONCUR’90,
volume 458 of LNCS, pages 314-331, Amsterdam. Springer-Verlag.

Heering, J. (1986). Partial evaluation and ω-completeness of algebraic
specifications. Theoretical Computer Science, 43, 149-167.

Hennessy, M. (1988). Axiomatising finite concurrent processes. SIAM Journal
on Computing, 17(5), 997-1017.

Hirshfeld, Y. and Jerrum, M. (1998). Bisimulation equivalence is decidable for
normed process algebra. LFCS Report ECS-LFCS-98-386, University of
Edinburgh.

Hirshfeld, Y. and Jerrum, M. (1999). Bisimulation equivalence is decidable for
normed process algebra. In J. Wiedermann, P. van Emde Boas, and M. Nielsen,
editors, Proceedings of ICALP’99, volume 1644 of LNCS, pages 412-421,
Prague. Springer.

Lazrek, A., Lescanne, P., and Thiel, J.-J. (1990). Tools for proving inductive
equalities, relative completeness, and ω-completeness. Information and
Computation, 84(1), 47-70.

Milner, R. (1989). Communication and Concurrency. Prentice-Hall
International, Englewood Cliffs.

Milner, R. and Moller, F. (1993). Unique decomposition of processes. Theoretical
Computer Science, 107, 357-363.

Moller, F. (1989). Axioms for Concurrency. Ph.D. thesis, University of
Edinburgh.

Moller, F. (1990). The importance of the left merge operator in process algebras.
In M. Paterson, editor, Proceedings of ICALP’90, volume 443 of LNCS, pages
752-764, Warwick. Springer.




A Complete Axiomatization for Observational 
Congruence of Prioritized Finite-State Behaviors 



Mario Bravetti and Roberto Gorrieri 



Universita di Bologna, Dipartimento di Scienze dell’Informazione 
Mura Anteo Zamboni 7, 40127 Bologna, Italy 
{bravetti, gorrieri}@cs.unibo.it



Abstract. Milner’s complete proof system for observational congruence
is crucially based on the possibility to equate $\tau$ divergent expressions to
non-divergent ones by means of the axiom $recX.(\tau.X + E) = recX.\tau.E$.
In the presence of a notion of priority, where e.g. actions of type $\delta$ have a
lower priority than silent $\tau$ actions, this axiom is no longer sound because
a $\delta$ action performable by $E$ is pre-empted in the left-hand term but not
in the right-hand term. The problem of axiomatizing priority using the
standard observational congruence has been open for a long time. Here we
show that this can be done by introducing an auxiliary operator $pri(E)$,
by suitably modifying the axiom above and by introducing some new
axioms. Our technique provides a complete axiomatization for Milner’s
observational congruence over finite-state terms of a process algebra with
priority and recursion.



1 Introduction 

In the last years the expressiveness of classical process algebras has been ex- 
tended in several ways. Often such extensions have led to the necessity of intro- 
ducing a notion of priority among actions which is useful, e.g., to model mecha- 
nisms of pre-emption. The problems arising from expressing priority have been 
previously studied in the context of prioritized process algebras, see e.g. IbllOldl 
and ^ for a survey. One of the open questions in this context (see 0) is find- 
ing a complete axiomatization for observational congruence in the presence of 
recursion. 

In P], where process algebras with priority are studied in full generality, the 
authors show that it is necessary to consider rather complex equivalence notions, 
which deviate from Milner’s standard notion of observational equivalence, in 
order to get the congruence property. Here, we consider a less general form of 
priority for which the equivalences presented in ^ reduce to Milner’s standard 
notion of observational congruence. This allows us to focus on the problem of 
finding a complete axiomatization for observational congruence in the presence 
of priority and recursion. In particular the process algebra we consider is the 
algebra of finite-state agents used by Milner in jOj for axiomatizing observational 
congruence in presence of recursion, extended with $\delta$ prefixing, where $\delta$ actions
have lower priority than silent $\tau$ actions. A language like this has been studied

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 744-|7^^ 2000. 
also in |^, where $\delta$ actions represent non-zero time delays, classical actions of
CCS are executed in zero time, and the priority of $\tau$ actions over $\delta$ actions
derives from the maximal progress assumption (i.e. the system cannot wait if it
has something internal to do). As in |5ll()ldl4lr)) we assume that visible actions
never have pre-emptive power over lower priority $\delta$ actions, because we see visible
actions as indicating only the potential for execution. On the other hand, as
observed e.g. in in the presence of a restriction operator this assumption is
necessary to get the congruence property.

In the literature complete axiomatizations of equivalences for algebras with 
priority have been presented for recursion free processes (e.g. in EOi). For this 
class of processes a notion of priority of $\tau$ actions (silent actions) over actions
of type $\delta$ is simply axiomatized by adding the axiom $\tau.E + \delta.F = \tau.E$ to
Milner’s standard complete proof system for observational congruence jBEI.

When we consider also recursive processes, if we simply try to extend Milner’s
proof system for observational congruence jS| with the axiom above, we do not
obtain a correct axiomatization. This is because the axiom:

$$recX.(\tau.X + E) = recX.\tau.E \qquad (*)$$

which is part of the proof system of [^, is no longer sound. For example if
$E = \delta.E'$, the right-hand term can (weakly) perform $\delta$, whilst the left-hand one
cannot because of the priority of $\tau$ actions over $\delta$ actions. The axiom above is
crucial for the completeness of the axiomatization in that it allows us to “escape” $\tau$
divergence in weakly guarded recursive terms, during the process of turning them
into strongly guarded ones. The axiom reflects the possibility of observational
congruence to always equate $\tau$ divergent expressions to non-divergent ones.

The effects of priority on the possibility of escaping divergence have been
previously studied in |0|, where the problem of finding a complete axiomatization
for recursive processes is tackled by considering a variant of Milner’s standard
notion of observational congruence, that makes it not always possible to escape
$\tau$ divergence. In particular Milner’s observational congruence is modified with
the additional requirement that two bisimilar terms must have the same
opportunity to silently become stable terms, i.e. terms that cannot perform $\tau$ actions.
Adopting this additional constraint, which now makes observational congruence
certainly sensitive to $\tau$ divergence, provides a simple necessary and sufficient
condition for the possibility of escaping divergence and for applying the axiom (*).
Divergence can be escaped in $recX.(\tau.X + E)$ if and only if $E$ may perform a
silent move.

In this paper we show that priority and Milner’s standard notion of
observational congruence are compatible: we obtain a complete axiomatization for
an algebra with priority and recursion even if it is always possible to escape $\tau$
divergence. This is done by replacing $E$ in the right-hand term of axiom (*)
with $pri(E)$, where $pri$ is an auxiliary operator. The behavior of $pri(E)$ is
obtained from that of $E$ by removing all $\delta$ transitions (and subsequent behaviors)
performable in state $E$. Therefore the new axiom is:

$$recX.(\tau.X + E) = recX.\tau.pri(E)$$
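
As a hedged illustration of how the modified axiom avoids the counterexample given for (*), take $E = \delta.E'$. The second step relies only on the informal description of $pri$ above (the $\delta$ transitions of its argument are removed), so treating $pri(\delta.E')$ as a term without transitions is our reading of that description, not an axiom quoted from this section.

```latex
\begin{align*}
recX.(\tau.X + \delta.E')
  &= recX.\tau.pri(\delta.E')
  && \text{(new axiom, instantiated with } E = \delta.E'\text{)} \\
  &= recX.\tau.0
  && \text{(assumed: } pri(\delta.E') \text{ has no transitions, like } 0\text{)}
\end{align*}
```

Neither side can weakly perform $\delta$: on the left the $\delta$ summand is pre-empted by $\tau$, and on the right it has been removed by $pri$. This is the property that fails for the right-hand side $recX.\tau.\delta.E'$ of the original axiom (*).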

Since $E$ may be a term including free variables (e.g. in the case of nested
recursions), in our axiomatization an important role in the process of turning
unguarded recursive terms into strongly guarded ones is played by terms $pri(X)$.
In particular we introduce the new axiom:

$$recX.(\tau.(pri(X) + E) + F) = recX.(\tau.X + E + F)$$

which is needed to remove unnecessary occurrences of terms $pri(X)$ in weakly
guarded recursions. Finally we add some basic axioms that capture the $\delta$
eliminating behavior of the new auxiliary operator.

In this way we provide a complete axiomatization for observational congru- 
ence over processes of the process algebra with priority. 

The paper is structured as follows. In Sect. 2 we present the algebra with 
priority and its operational semantics. In Sect. 3 we present a complete axiom- 
atization for observational congruence over processes of the algebra. Finally in 
Sect. 4 we report some concluding remarks. 

2 Prioritized Algebra 

The prioritized algebra that we consider is just CCS without static operators PI
extended with $\delta$ prefixing, where $\delta$ actions have lower priority than silent $\tau$
actions. The set of prioritized actions, which includes the silent action $\tau$ denoting
an internal move, is denoted by $PAct$, ranged over by $a, b, c, \ldots$. The set of all
actions is defined by $Act = PAct \cup \{\delta\}$, ranged over by $\alpha, \alpha', \ldots$. The set of term
variables is $Var$, ranged over by $X, Y, \ldots$. The set $\mathcal{E}$ of behavior expressions,
ranged over by $E, F, G, \ldots$ is defined by the following syntax.

$$E ::= 0 \mid X \mid \alpha.E \mid E + E \mid recX.E$$

The meaning of the operators is the standard one of |MD], where $recX$ denotes
recursion in the usual way.

As in PI we take the liberty of identifying expressions which differ only by
a change of bound variables (hence we do not need to deal with $\alpha$-conversion
explicitly). We will write $E\{F/X\}$ for the result of substituting $F$ for each
occurrence of $X$ in $E$, renaming bound variables as necessary.

We adopt the standard definitions of m for free variable, and open and
closed term. The set of processes, i.e. closed terms, is denoted by $\mathcal{P}$, ranged over
by $P, Q, R, \ldots$.

We adopt the following standard definitions concerning guardedness of vari- 
ables. 

Definition 1. $X$ is weakly guarded in $E$ if each occurrence of $X$ is within some
subexpression of $E$ of the form $\alpha.F$. $X$ is (strongly) guarded in $E$ if we
additionally require that $\alpha \neq \tau$. $X$ is unguarded in $E$ if $X$ is not (strongly) guarded
in $E$. $X$ is fully unguarded in $E$ if $X$ is neither strongly nor weakly guarded in
$E$. ■
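
The four notions of Definition 1 can be checked by one traversal of the expression, as in the following sketch. The tuple encoding of expressions, the action name `"tau"` and all function names are hypothetical. The sketch also assumes that only free occurrences of $X$ matter, so an inner recursion binding the same variable hides the occurrences below it.

```python
TAU = "tau"

# Hypothetical encoding of behavior expressions (not from the paper):
#   ("nil",) | ("var", X) | ("prefix", action, E) | ("sum", E, F) | ("rec", X, E)


def _guarded(X, E, is_guard, under_guard=False):
    """True if every free occurrence of X in E lies under a prefix accepted by is_guard."""
    tag = E[0]
    if tag == "nil":
        return True
    if tag == "var":
        return E[1] != X or under_guard
    if tag == "prefix":
        return _guarded(X, E[2], is_guard, under_guard or is_guard(E[1]))
    if tag == "sum":
        return (_guarded(X, E[1], is_guard, under_guard)
                and _guarded(X, E[2], is_guard, under_guard))
    if tag == "rec":
        # An inner "rec X" rebinds X, so occurrences below it are not free.
        return E[1] == X or _guarded(X, E[2], is_guard, under_guard)
    raise ValueError(f"unknown constructor: {tag!r}")


def weakly_guarded(X, E):
    return _guarded(X, E, lambda action: True)


def strongly_guarded(X, E):
    return _guarded(X, E, lambda action: action != TAU)


def unguarded(X, E):
    return not strongly_guarded(X, E)


def fully_unguarded(X, E):
    # Not weakly guarded already implies not strongly guarded.
    return not weakly_guarded(X, E)


X = ("var", "X")
e1 = ("sum", ("prefix", TAU, X), ("prefix", "a", X))   # tau.X + a.X
e2 = ("prefix", "a", ("prefix", TAU, X))               # a.tau.X
e3 = ("sum", X, ("prefix", "a", ("nil",)))             # X + a.0
print(weakly_guarded("X", e1), strongly_guarded("X", e1))  # prints: True False
print(strongly_guarded("X", e2), fully_unguarded("X", e3))  # prints: True True
```

In the first example one occurrence of the variable sits only under a silent prefix, so the expression is weakly but not strongly guarded, which is exactly the situation the axiomatization has to handle.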
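
$$\alpha.E \xrightarrow{\alpha} E$$

$$\frac{E \xrightarrow{a} E'}{E + F \xrightarrow{a} E'} \qquad \frac{F \xrightarrow{a} F'}{E + F \xrightarrow{a} F'}$$

$$\frac{E \xrightarrow{\delta} E' \quad F \not\xrightarrow{\tau}}{E + F \xrightarrow{\delta} E'} \qquad \frac{F \xrightarrow{\delta} F' \quad E \not\xrightarrow{\tau}}{E + F \xrightarrow{\delta} F'}$$

$$\frac{E \xrightarrow{\alpha} E'}{recX.E \xrightarrow{\alpha} E'\{recX.E/X\}}$$

Table 1. Operational Semantics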



A recursion recX.E is weakly guarded, (strongly) guarded, unguarded, fully 
unguarded, if X is weakly guarded, (strongly) guarded, unguarded, fully un- 
guarded in E, respectively. As in jOI we say that an expression E is guarded 
(unguarded) if every (some) subexpression of E which is a recursion is guarded 
(unguarded) . 

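The operational semantics of the algebra terms is given as a relation
$\longrightarrow \subseteq \mathcal{E} \times Act \times \mathcal{E}$. We write $E \xrightarrow{\alpha} F$ for $(E, \alpha, F) \in \longrightarrow$, $E \xrightarrow{\alpha}$ for
$\exists F : (E, \alpha, F) \in \longrightarrow$ and $E \not\xrightarrow{\alpha}$ for $\nexists F : (E, \alpha, F) \in \longrightarrow$. $\longrightarrow$ is defined as
the least relation satisfying the operational rules in Table 1.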

As in mg we capture the priority of $\tau$ actions over $\delta$ actions by cutting
transitions which cannot be performed directly in semantic models (and not by
discarding them at the level of bisimulation definition as done in jS] ) so that we
can just apply the ordinary notion of observational congruence P] .

Note that the rule for recursion is not the standard one used by Milner
in jSl?], in that it defines the moves of $recX.E$ in a purely structural way, starting
from those of $E$ (where $X$ occurs free), instead of $E\{recX.E/X\}$. This rule,
which was proven to be equivalent to that of Milner in [Q, gives the possibility
to simplify the proofs of some results. In particular such proofs can be made by
simple induction on the structure of terms instead of induction on the depth of
the inference of term transitions.
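
The operational rules lend themselves to a direct prototype. The sketch below assumes the rules are read as follows: a prefix performs its action; a choice inherits the prioritized moves of both summands; a choice inherits a $\delta$ move of one summand only if the other summand has no $\tau$ move; and a recursion inherits the moves of its body, with the recursion substituted for the bound variable in the target. The tuple encoding and all names are hypothetical, and the substitution does not rename bound variables, so it assumes that bound variable names are pairwise distinct.

```python
TAU, DELTA = "tau", "delta"
NIL = ("nil",)

# Hypothetical encoding: ("nil",) | ("var", X) | ("prefix", action, E)
#                        | ("sum", E, F) | ("rec", X, E)


def subst(E, X, R):
    """E{R/X}: replace the free occurrences of X in E by R (no renaming)."""
    tag = E[0]
    if tag == "nil":
        return E
    if tag == "var":
        return R if E[1] == X else E
    if tag == "prefix":
        return ("prefix", E[1], subst(E[2], X, R))
    if tag == "sum":
        return ("sum", subst(E[1], X, R), subst(E[2], X, R))
    if tag == "rec":
        return E if E[1] == X else ("rec", E[1], subst(E[2], X, R))
    raise ValueError(f"unknown constructor: {tag!r}")


def transitions(E):
    """Set of (action, target) pairs of E under the assumed reading of the rules."""
    tag = E[0]
    if tag in ("nil", "var"):
        return set()
    if tag == "prefix":
        return {(E[1], E[2])}
    if tag == "sum":
        moves = transitions(E[1]) | transitions(E[2])
        # A summand with a delta move never has a tau move itself, so it is
        # enough to drop all delta moves as soon as any tau move is present.
        if any(action == TAU for action, _ in moves):
            moves = {(action, target) for action, target in moves if action != DELTA}
        return moves
    if tag == "rec":
        return {(action, subst(target, E[1], E)) for action, target in transitions(E[2])}
    raise ValueError(f"unknown constructor: {tag!r}")


def can_weakly_do(E, action):
    """True if some state reachable from E by tau steps has a move labelled action."""
    seen, todo = set(), [E]
    while todo:
        state = todo.pop()
        if state in seen:
            continue
        seen.add(state)
        for act, target in transitions(state):
            if act == action:
                return True
            if act == TAU:
                todo.append(target)
    return False


X = ("var", "X")
left = ("rec", "X", ("sum", ("prefix", TAU, X), ("prefix", DELTA, NIL)))  # recX.(tau.X + delta.0)
right = ("rec", "X", ("prefix", TAU, ("prefix", DELTA, NIL)))             # recX.tau.delta.0
print(transitions(left) == {(TAU, left)})                        # prints: True
print(can_weakly_do(left, DELTA), can_weakly_do(right, DELTA))   # prints: False True
```

The last line replays the counterexample from the introduction for the unmodified axiom: the left-hand term can never perform $\delta$ because the move is pre-empted, while the right-hand term can perform it after a silent step.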

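The equivalence notion we consider over the terms of our prioritized process
algebra is the standard notion of observational congruence extended to open
terms [?]. In particular we consider here the following characterization of
observational congruence given in [?].

As in [?] we use $\xrightarrow{\tau}{}^{+}$ to denote computations composed of
all $\xrightarrow{\tau}$ transitions whose length is at least one and
$\xrightarrow{\tau}{}^{*}$ to denote computations composed of all
$\xrightarrow{\tau}$ transitions whose length is possibly zero. Let
$\stackrel{\alpha}{\Longrightarrow}$ denote
$\xrightarrow{\tau}{}^{*} \xrightarrow{\alpha} \xrightarrow{\tau}{}^{*}$.
Moreover we define
$\stackrel{\hat{\alpha}}{\Longrightarrow} = \stackrel{\alpha}{\Longrightarrow}$
if $\alpha \neq \tau$ and
$\stackrel{\hat{\tau}}{\Longrightarrow} = \xrightarrow{\tau}{}^{*}$. Let
$E \triangleright X$ denote that E contains a free unguarded occurrence of X.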

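Definition 2. A relation $\beta \subseteq \mathcal{E} \times \mathcal{E}$ is a
weak bisimulation if, whenever $(E, F) \in \beta$:

— If $E \xrightarrow{\alpha} E'$ then, for some F',
  $F \stackrel{\hat{\alpha}}{\Longrightarrow} F'$ and $(E', F') \in \beta$.

— If $F \xrightarrow{\alpha} F'$ then, for some E',
  $E \stackrel{\hat{\alpha}}{\Longrightarrow} E'$ and $(E', F') \in \beta$.

— $E \triangleright X$ iff $F \triangleright X$.

Two expressions E, F are weakly bisimilar, written $E \approx F$, iff (E, F) is
included in some weak bisimulation. ■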



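Definition 3. Two expressions E, F are observationally congruent, written
$E \simeq F$, iff:

— If $E \xrightarrow{\alpha} E'$ then, for some F',
  $F \stackrel{\alpha}{\Longrightarrow} F'$ and $E' \approx F'$.

— If $F \xrightarrow{\alpha} F'$ then, for some E',
  $E \stackrel{\alpha}{\Longrightarrow} E'$ and $E' \approx F'$.

— $E \triangleright X$ iff $F \triangleright X$. ■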

Corollary 4. If $E \simeq F$ then:

$$E \xrightarrow{\tau} \;\Leftrightarrow\; F \xrightarrow{\tau}$$ ■

The following theorem shows that the presence of priority preserves the
congruence property of observational congruence w.r.t. the operators of the
algebra.

Theorem 5. $\simeq$ is a congruence w.r.t. prefix, choice and recursion
operators.

Proof. In the case of the prefix operator the proof is trivial. As far as the
choice operator is concerned, the only problematic case arises when, given
$E \simeq F$, we have a move $E + G \xrightarrow{\delta} H$ derived from
$G \xrightarrow{\delta} H$. In this case we can conclude that
$F + G \xrightarrow{\delta} H$ by using Corollary 4. See [?] for a detailed
proof which also includes the case of the recursion operator. ■

3 Axiomatization 

We now present an axiomatization of $\simeq$ which is complete over processes
$P \in \mathcal{P}$ of our algebra.

As we already explained in the introduction, the law of ordinary CCS which
makes it possible to escape $\tau$ divergence:

$$recX.(\tau.X + E) = recX.\tau.E$$

is not sound in a calculus with this kind of priority. Consider for instance the
divergent term $F \equiv recX.(\tau.X + \delta.0)$. Because of the priority of
"$\tau$" actions over "$\delta$" actions, the operational semantics of F is
isomorphic to that of $recX.\tau.X$. Hence F is an infinitely looping term which
can never escape from divergence by executing the action "$\delta$". If the
cited law were sound, we would obtain $F = recX.\tau.\delta.0$ and this is
certainly not the case.

In order to overcome this problem, in [?] the distinguished symbol "$\bot$" is
introduced, which represents an ill-defined term that can be removed from a
summation only if a silent computation is possible. In this way, by considering
the rule $recX.(\tau.X + E) = recX.\tau.(E + \bot)$, the resulting term which
escapes divergence can be turned into a "normal" term only if E may execute a
silent move.

This law is surely sound (over terms without "$\bot$") also in our language, but
is not sufficient to achieve completeness. Since, differently from [?], we do
not impose conditions about stability in our definition of observational
congruence, we can escape divergence not only when E includes a silent move but
for all possible terms E. For example in our calculus (but not in [?]) the term
$recX.\tau.X$ is equivalent to $\tau.0$ (as in standard CCS), so we can escape
divergence in F even if $\tau.X$ has no silent alternative inside recX. In
general the behavior of E' such that $recX.\tau.E' = recX.(\tau.X + E)$ is
obtained from that of E by removing all "$\delta$" actions (and subsequent
behaviors) performable in state E. We denote such E', representing the
"prioritized" behavior of state E, with pri(E). The operational semantics of the
auxiliary operator pri is just:

$$\frac{E \xrightarrow{a} E'}{pri(E) \xrightarrow{a} E'} \qquad (a \neq \delta)$$

Therefore in our case $\tau$ divergence can always be escaped by turning E into
pri(E) and the strongly guarded terms we obtain are always "well-defined"
terms.

Note that the introduction of this auxiliary operator is crucial for being able
to axiomatize the priority of $\tau$ actions over $\delta$ actions in our case.
Since we have to remove $\delta$ actions performable by a term E even if E does
not include a silent move, we cannot do this by employing a special symbol like
"$\bot$" instead of using an operator. This is because $\bot$ must somehow be
removed at the end of the deletion process (in [?] $\bot$ is eliminated by
silent alternatives) in order to obtain a "normal" term.

The axiomatization of $\simeq$ we propose is made over the set of terms
$\mathcal{E}_{pri}$, generated by extending the syntax to include the new
operator pri(E). We start by noting that the congruence property trivially
extends to this operator.

Theorem 6. $\simeq$ is a congruence w.r.t. the new operator pri(E).

We adopt the following notion of serial variable, which is used in the axiom- 
atization. 



Definition 7. X is serial in $E \in \mathcal{E}_{pri}$ if every subexpression of
E which contains X, apart from X itself, is of the form $\alpha.F$, $E' + E''$
or recY.F for some variable Y. ■
(A1)    $E + F = F + E$
(A2)    $(E + F) + G = E + (F + G)$
(A3)    $E + E = E$
(A4)    $E + 0 = E$

(Tau1)  $\alpha.\tau.E = \alpha.E$
(Tau2)  $E + \tau.E = \tau.E$
(Tau3)  $\alpha.(E + \tau.F) + \alpha.F = \alpha.(E + \tau.F)$

(Pri1)  $pri(0) = 0$
(Pri2)  $pri(a.E) = a.E$
(Pri3)  $pri(\delta.E) = 0$
(Pri4)  $pri(E + F) = pri(E) + pri(F)$
(Pri5)  $pri(pri(E)) = pri(E)$
(Pri6)  $\tau.E + F = \tau.E + pri(F)$

(Rec1)  $recX.E = E\{recX.E/X\}$
(Rec2)  $F = E\{F/X\} \Rightarrow F = recX.E$
        provided that X is strongly guarded and serial in E

(Ung1)  $recX.(X + E) = recX.E$
(Ung2)  $recX.(\tau.X + E) = recX.(\tau.pri(E))$
(Ung3)  $recX.(\tau.(X + E) + F) = recX.(\tau.X + E + F)$
(Ung4)  $recX.(\tau.(pri(X) + E) + F) = recX.(\tau.X + E + F)$

Table 2. Axiom system A



The axiom system A is formed by the axioms presented in Table 2.
The axiom (Pri6) expresses the priority of $\tau$ actions over $\delta$ actions.
Note that from (Pri6) we can derive $\tau.E + \delta.F = \tau.E$ by applying
(Pri3). The axioms (Rec1), (Rec2) handle strongly guarded recursion in the
standard way [?]. The axioms (Ung1) and (Ung2) are used to turn unguarded terms
into strongly guarded ones similarly as in [?]. The axiom (Ung3) and the new
axiom (Ung4) are used to transform weakly guarded recursions into the form
required by the axiom (Ung2), so that they can be turned into strongly guarded
ones. In particular the role of axiom (Ung4) is to remove unnecessary
occurrences of terms pri(X) in weakly guarded recursions.
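As an illustration of how the axioms (Pri1) to (Pri5) can be read as left-to-right rewrite rules, the following Python sketch pushes pri inwards through recursion-free terms until it either disappears or is applied directly to a variable. The tuple encoding of terms and the function names are assumptions of this sketch, not notation from the paper.

```python
# Hypothetical encoding: ("nil",), ("var", name), ("pre", action, term),
# ("sum", left, right), ("pri", term).
DELTA = "delta"
NIL = ("nil",)


def push_pri(term):
    """Rewrite pri(term) for a term whose only pri-subterms are pri(variable)."""
    kind = term[0]
    if kind == "nil":
        return NIL                                  # (Pri1) pri(0) = 0
    if kind == "var":
        return ("pri", term)                        # no axiom applies to pri(X)
    if kind == "pre":
        # (Pri3) pri(delta.E) = 0, (Pri2) pri(a.E) = a.E otherwise
        return NIL if term[1] == DELTA else term
    if kind == "sum":
        return ("sum", push_pri(term[1]), push_pri(term[2]))   # (Pri4)
    if kind == "pri":
        return term                                 # (Pri5) pri(pri(X)) = pri(X)
    raise ValueError(f"unknown term kind: {kind!r}")


def eliminate_pri(term):
    """Apply (Pri1)-(Pri5) bottom-up to a recursion-free term."""
    kind = term[0]
    if kind in ("nil", "var"):
        return term
    if kind == "pre":
        return ("pre", term[1], eliminate_pri(term[2]))
    if kind == "sum":
        return ("sum", eliminate_pri(term[1]), eliminate_pri(term[2]))
    if kind == "pri":
        return push_pri(eliminate_pri(term[1]))
    raise ValueError(f"unknown term kind: {kind!r}")
```

Applied to pri(a.0 + delta.0), the sketch returns a.0 + 0, which (A4) then reduces to a.0.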

3.1 Soundness 

Theorem 8. Given $E, F \in \mathcal{E}_{pri}$, if $A \vdash E = F$ then
$E \simeq F$.

Proof. The soundness of the laws (Rec1) and (Rec2) derives from the fact that
systems of equations of the form $X_i = E_i$, where the variables $X_i$ are
guarded and serial in terms $E_i$ and are not in the scope of any recursion,
have a unique solution up to provable equality. The proof of this fact is a
straightforward adaptation of the proof given in [?]. The soundness of the new
equations (Pri1) – (Pri6) trivially derives from the fact that the semantic
models of left-hand and right-hand terms are isomorphic. The soundness of the
laws (Ung1) – (Ung4) is proved in [2]. ■



3.2 Completeness 

Now we will show that the axiom system A is complete over processes of
$\mathcal{P}$. In order to do this we follow the lines of [?], so we deal with
systems of recursive equations.

We start by introducing the machinery necessary for proving completeness
over guarded expressions $E \in \mathcal{E}$. Afterwards we will show how the
axioms (Ung) can be used to turn an unguarded process of $\mathcal{P}$ into a
guarded process of $\mathcal{P}$. Note that the new operator pri(E) is used only
in the intermediate steps of the second procedure.

Definition 9. An equation set with formal variables
$\tilde{X} = \{X_1, \ldots, X_n\}$ and free variables
$\tilde{W} = \{W_1, \ldots, W_m\}$, where $\tilde{X}$ and $\tilde{W}$ are
disjoint, is a set $S = \{X_i = H_i \mid 1 \le i \le n\}$ of equations such that
the expressions $H_i$ ($1 \le i \le n$) have free variables in
$\tilde{X} \cup \tilde{W}$. ■

We say that an expression E provably satisfies S if there are expressions
$\tilde{E} = \{E_1, \ldots, E_n\}$ with free variables in $\tilde{W}$ such that
$E_1 \equiv E$ and for $1 \le i \le n$ we have
$A \vdash E_i = H_i\{\tilde{E}/\tilde{X}\}$.

The equation sets that are actually dealt with in the proof of completeness
of [?] belong to the subclass of standard equation sets. Here we have to
slightly change the characterization of this subclass because of the presence of
priority.

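Definition 10. An equation set $S = \{X_i = H_i \mid 1 \le i \le n\}$, with
formal variables $\tilde{X} = \{X_1, \ldots, X_n\}$ and free variables
$\tilde{W} = \{W_1, \ldots, W_m\}$, is standard if each expression $H_i$
($1 \le i \le n$) is of the form¹:

$$H_i \equiv \sum_{j \in J_i} \alpha_{i,j}.X_{f(i,j)} + \sum_{k \in K_i} W_{g(i,k)}$$

where:

$$\exists j \in J_i : \alpha_{i,j} = \tau \;\Rightarrow\; \nexists j \in J_i : \alpha_{i,j} = \delta.$$

As in [?], for a standard equation set S we define the relations
$\longrightarrow_S \subseteq \tilde{X} \times Act \times \tilde{X}$ and
$\triangleright_S \subseteq \tilde{X} \times \tilde{W}$ as follows:

$X_i \xrightarrow{\alpha}_S X$ iff $\alpha.X$ occurs in $H_i$

$X_i \triangleright_S W$ iff W occurs in $H_i$

¹ We assume $\sum_{j \in J} E_j \equiv 0$ if $J = \emptyset$.

Definition 11. A standard equation set S with formal variables
$\tilde{X} = \{X_1, \ldots, X_n\}$ is guarded if there is no cycle
$X_i \xrightarrow{\tau}{}^{+}_S X_i$. ■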
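The two conditions on standard equation sets can be checked mechanically. The following Python sketch assumes, as in Milner's treatment of guarded equation sets, that the forbidden cycle consists of tau-steps; the dictionary representation (each formal variable mapped to a list of (action, variable) prefixes and a list of free variables) and the function names are hypothetical.

```python
TAU, DELTA = "tau", "delta"


def respects_priority(equations):
    """No right-hand side may offer both a tau-prefix and a delta-prefix."""
    for prefixes, _free_vars in equations.values():
        actions = {action for action, _ in prefixes}
        if TAU in actions and DELTA in actions:
            return False
    return True


def is_guarded(equations):
    """True iff no formal variable reaches itself through tau-prefixes only."""
    tau_successors = {
        x: [y for action, y in prefixes if action == TAU]
        for x, (prefixes, _free_vars) in equations.items()
    }
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {x: WHITE for x in equations}

    def has_cycle_from(x):
        colour[x] = GREY
        for y in tau_successors[x]:
            if colour[y] == GREY:
                return True          # back edge: a tau-cycle exists
            if colour[y] == WHITE and has_cycle_from(y):
                return True
        colour[x] = BLACK
        return False

    return not any(colour[x] == WHITE and has_cycle_from(x) for x in equations)
```

The depth-first search is a standard three-colour cycle check restricted to tau-edges; it assumes every variable mentioned in a prefix is itself a key of the dictionary.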

The following theorem guarantees that from a guarded expression
$E \in \mathcal{E}$ we can derive a standard guarded equation set which is
provably satisfied by E.

Theorem 12. Every guarded expression $E \in \mathcal{E}$ provably satisfies a
standard guarded equation set S.

Proof. The proof is as in [?], where, in addition, we modify the resulting
equation set by eliminating the occurrences of $\delta.X$ (for some X) in the
equations which include terms $\tau.Y$ (for some Y). This is done by using laws
(Pri6) and (Pri3). ■

Once the form of standard guarded equation sets S is established, completeness
over guarded expressions is obtained by saturating equation sets S as in [?]. In
particular the proofs of the following lemma and two theorems are the same as
in [?].

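Definition 13. A standard equation set S with formal variables $\tilde{X}$ is
saturated if, for all $X \in \tilde{X}$:

(i) $X \stackrel{\alpha}{\Longrightarrow}_S X' \;\Rightarrow\; X \xrightarrow{\alpha}_S X'$

(ii) $X \xrightarrow{\tau}{}^{*}_S \triangleright_S W \;\Rightarrow\; X \triangleright_S W$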



Lemma 14. Let $E \in \mathcal{E}$ provably satisfy S, standard and guarded. Then
there is a saturated, standard and guarded equation set S' provably satisfied by
E. ■

The possibility of saturating standard and guarded equation sets S leads to 
the following theorem. 

Theorem 15. Let $E \in \mathcal{E}$ provably satisfy S, and $F \in \mathcal{E}$
provably satisfy T, where both S and T are standard, guarded sets of equations,
and let $E \simeq F$. Then there is a standard, guarded equation set U provably
satisfied by both E and F. ■



Theorem 16. Let $E, F \in \mathcal{E}$ provably satisfy the same standard
guarded equation set S; then $A \vdash E = F$. ■

Hence we have proved completeness over guarded expressions.

Theorem 17. If E and F are guarded expressions of $\mathcal{E}$ and
$E \simeq F$ then $A \vdash E = F$. ■

Now we show that each unguarded process can be turned into a guarded
process of $\mathcal{P}$, so that we obtain completeness also over unguarded
processes. We start with a technical lemma which, in analogy to [?], is
important to obtain this result.

Lemma 18. If X occurs free and unguarded in $E \in \mathcal{E}$, then
$A \vdash E = X + E$.

The proof of this lemma is the same as in [?]. ■

Theorem 19. For each process $P \in \mathcal{P}$ there exists a guarded
$P' \in \mathcal{P}$ such that $A \vdash P = P'$.



Proof. We show, by structural induction, that given an expression
$E \in \mathcal{E}$ it is possible to find an expression
$F \in \mathcal{E}_{pri}$ such that:

1. if pri(G) is a subexpression of F then $G \equiv X$ for some free variable X;

2. for any variable X, pri(X) is weakly guarded in F, i.e. each occurrence of
pri(X) is within some subexpression of F of the form $\alpha.G$;

3. a summation cannot have both pri(X) and Y as arguments, for any (possibly
coincident) variables X and Y;

4. for any variable X, each subterm recX.G of F is (strongly) guarded in F,
i.e. each occurrence of recX.G is within some subexpression of F of the form
$\alpha.H$, with $\alpha \neq \tau$;

5. F is guarded;

6. $A \vdash E = F$.

Note that a consequence of property 4 is that each unguarded occurrence of
any free variable X of F does not lie within the scope of a subexpression recY.G
of F.

Showing this result proves the theorem, in that if $E \in \mathcal{E}$ is a
process of $\mathcal{P}$, i.e. a closed term, we have (by the properties of F
above) that F is also a process of $\mathcal{P}$, it is guarded, and
$A \vdash E = F$. The result above is derived by structural induction on the
syntax of an expression $E \in \mathcal{E}$. In particular in the case
$E \equiv recX.E'$ a fundamental role is played by Lemma 18, which generates
from a term E'' such that X (pri(X)) is unguarded in E'' an equivalent term
X + E'' (pri(X) + E'') so that law (Ung3) (law (Ung4)) can be applied. See [2]
for a complete proof. ■

From Theorem 17 and Theorem 19 we derive the completeness of A over
processes of $\mathcal{P}$.

Theorem 20. Given $P, Q \in \mathcal{P}$, if $P \simeq Q$ then
$A \vdash P = Q$. ■

Note that all the axioms of A are actually used in the proof of completeness.
In particular in the proof of completeness over guarded expressions
(Theorem 17) we employ the standard axioms (A1) – (A4), (Tau1) – (Tau3) and
(Rec1), (Rec2) as in [?], plus the new axioms (Pri3) and (Pri6). All these
axioms are necessary even if we restrict ourselves to completeness over guarded
processes only. Moreover, proving that a process of $\mathcal{P}$ can always be
turned into a guarded process (Theorem 19) requires the use of the remaining
axioms (Pri1), (Pri2), (Pri4), (Pri5) and (Ung1) – (Ung4). This supports the
claim that our axiomatization is irredundant.
4 Conclusion

The algebra presented in this paper and its axiomatization can be extended
to include all the operators of CCS, by employing operational rules like those
presented in [?]. This can be done easily if parallel composition is assumed to
have an operational semantics implementing local pre-emption (and not global
pre-emption, see [?]). This means that $\tau$ actions of a sequential process
may pre-empt only actions $\delta$ of the same sequential process. For instance
in $\tau.E \,|\, \delta.F$ the action $\delta$ of the righthand process is not
pre-empted by the action $\tau$ of the lefthand process, as instead happens if
we assume global pre-emption. On the other hand, local pre-emption seems to be
natural in the context of distributed systems, where the choices of a process do
not influence the choices of another process.

If global pre-emption is, instead, assumed, then Milner's standard notion of
observational equivalence is not a congruence for the parallel operator (see
[?]) with the usual operational rule, where $E \,|\, F$ may perform a $\delta$
action (originating from E or F) only if it cannot execute any $\tau$ action.
This is because, e.g., $recX.\tau.X$ is observationally congruent to $\tau.0$,
but $recX.\tau.X \,|\, \delta.0$, whose semantics, due to global pre-emption, is
that of $recX.\tau.X$, is not observationally congruent to
$\tau.0 \,|\, \delta.0$, whose semantics is that of $\tau.\delta.0$. In this
case a possibility is to resort to a finer notion of observational congruence
similar to that presented in [?].

On the other hand, when global priority derives from execution times associated
with actions as in [?], where $\delta$ actions represent non-zero time delays
and classical actions of CCS are executed in zero time, the operational rule for
the parallel operator implementing global pre-emption of [?] does not seem the
most natural one. In this context a process like $recX.\tau.X$ represents a Zeno
process which executes infinitely many $\tau$ actions at the same time point.
According to the operational rule for the parallel operator of [?], the semantic
model of $recX.\tau.X \,|\, \delta.0$, as already stated, is that of
$recX.\tau.X$, where we observe only the $\tau$ moves of the lefthand Zeno
process, which are all executed in zero time. This is in contrast with the
intuition that in $recX.\tau.X \,|\, \delta.0$ the righthand process should
eventually advance because, since no synchronization occurs between the two
processes, the behavior of the righthand process should not be influenced by
that of the lefthand process. The point is that the infinite sequence of $\tau$
moves executed by the lefthand process prevents time from advancing and, as a
consequence, by employing the operational rule of [?], we do not observe the
behavior of the system after the time point where the $\tau$ actions are
executed. This leads to an incomplete description of the behavior of
$recX.\tau.X \,|\, \delta.0$ that makes the semantics of
$recX.\tau.X \,|\, \delta.0$ different from that of $\tau.0 \,|\, \delta.0$.
Therefore it seems that the real point here is not to consider a finer notion of
observational congruence that makes $recX.\tau.X$ not equivalent to $\tau.0$ (as
done in [?]), but to have an operational semantics for the parallel operator
that makes it possible to observe the behavior of the system after the critical
time point. The definition of such a semantics is an open problem which we are
currently working on.
Abstract. Consider the problem of sending a single message from a
sender to a receiver through an m × n mesh with asynchronous links
that may stop working, and memoryless intermediate nodes. We prove
that for $m \in O(1)$, it is necessary and sufficient to use packet headers
that are $\Theta(\log\log n)$ bits long.



1 Introduction 

Protocols that send information bundled into packets over a communication 
network allocate some number of bits in each packet for transmitting control 
information. We here refer to such bits as header bits. These bits might include 
sequence numbers to ensure that packets are received in the correct order, or 
they might contain routing information to ensure that a packet is delivered to its 
destination. When the number of message bits in a packet is small (for example, 
in acknowledgements), the header bits can make up a significant fraction of the 
total number of bits contained in the packet. A natural question to ask is the 
following: how large must packet headers be for reliable communication? 

This problem is addressed in part of a large body of research on the
end-to-end communication problem [AAF+94], [AAG+97], [AMS89], [APV96], [?],
[KOR95], [?], [?]. The end-to-end communication
problem is to send information from one designated processor (the sender S) 
to another designated processor (the receiver R) over an unreliable communi- 
cation network. This is a fundamental problem in distributed computing, since 
(a) communication is crucial to distributed computing and (b) as the size of a 
network increases, the likelihood of a fault occurring somewhere in the network 
also increases. 

* A full version appears at http://www.dcs.warwick.ac.uk/~leslie/papers/endtoend.ps.

Adler and Fich [AF99] studied the question of how many header bits are
required for end-to-end communication in the setting where links may fail. They
prove that, for the complete network of n processors or any network that
contains it as a minor (such as the $n^2$-input butterfly or the n × n × 2
mesh), any memoryless protocol that ensures delivery of a single message using
headers with fewer than $\lceil \log_2 n \rceil - 3$ bits generates an infinite
amount of message traffic.

If there is a path of live links from S to R in an n-node network, then there
is a simple path of live links of length at most n − 1. Therefore, it suffices
to use the simple "hop count" algorithm [?] which discards messages that have
been forwarded n − 1 times. Since this can be done with headers of size
$\lceil \log n \rceil$, for the complete graph we have upper and lower bounds
that match to within a small additive constant, and for the $n^2$-input
butterfly and the n × n × 2 mesh to within a small multiplicative constant.

However, for several graphs there remains a large gap between the best upper
and lower bounds. Planar graphs, including two-dimensional meshes, do not
contain a complete graph on more than 4 nodes as a minor and, as a result,
no previous work has demonstrated a lower bound larger than a constant for any
planar graph. Furthermore, for some graphs it is possible to do better than the
simple hop count algorithm. For example, Adler and Fich observed that
if F is a feedback vertex set of the underlying graph G (that is, if every cycle
of G contains at least one vertex of F), then one can use a variant of the hop
count protocol which discards messages that have visited F more than |F| times.
The discarding does no harm, since a simple path visits F at most |F| times. But
it ensures that the amount of traffic generated is finite. Note that in this
variant of the hop count protocol, the length of packet headers is at most
$\lceil \log_2(|F| + 1) \rceil$.
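The header sizes quoted for the two hop count variants are simply the number of bits needed to hold the respective counter. A small Python sketch of this arithmetic follows; the function names are hypothetical, and the sketch only evaluates the two ceiling-of-logarithm expressions from the text rather than implementing either protocol.

```python
def bits_for_counter(values):
    """Bits needed to distinguish `values` counter states: ceil(log2(values))."""
    if values < 1:
        raise ValueError("a counter needs at least one state")
    # For values >= 1, (values - 1).bit_length() equals ceil(log2(values)).
    return (values - 1).bit_length()


def hop_count_header_bits(n):
    """Plain hop count in an n-node network: ceil(log2 n) bits."""
    return bits_for_counter(n)


def feedback_set_header_bits(f):
    """Variant counting visits to a feedback vertex set of size f:
    ceil(log2(f + 1)) bits."""
    return bits_for_counter(f + 1)
```

Integer `bit_length` is used instead of floating-point logarithms so that the result is exact for large arguments.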

However, some graphs have no small feedback vertex sets. In particular, any
feedback vertex set for the m × n mesh has size at least
$\lfloor m/2 \rfloor \cdot \lfloor n/2 \rfloor$. In this case, this variant does
not offer significant improvement over the hop count algorithm.

Thus we see that a network that has resisted both lower bound and upper
bound improvements is the two-dimensional mesh. Prior to this work, there was
no upper bound better than $O(\log mn)$, nor lower bound better than
$\Omega(1)$, for any m × n mesh with m, n > 2. For m = 2, headers of length one
suffice in our network (since no backward move is needed and we need only
distinguish vertical and horizontal arrivals) [AF99]. In [AF99], it is
conjectured that $\Omega(\log n)$ header bits are necessary for a protocol to
ensure delivery of a single message in an n × n mesh without generating an
infinite amount of message traffic.

Here, we attack this open problem by considering m × n meshes, for constant
$m \ge 3$. We prove the unexpected result that $\Theta(\log\log n)$ bit headers
are necessary and sufficient for such graphs.



1.1 Network Model 

We model a network by an undirected graph G, with a node corresponding to 
each processor and an edge corresponding to a link between two processors. 
Specifically, we consider the graphs G(m, n) with a sender node S and a receiver 
node R in addition to the mn intermediate nodes (i, j), for $0 \le i < m$ and
$0 \le j < n$. There are links between

— node S and node (i, 0), for $0 \le i < m$,

— node (i, j) and node (i, j + 1), for $0 \le i < m$ and $0 \le j < n - 1$,

— node (i, j) and node (i + 1, j), for $0 \le i < m - 1$ and $0 \le j < n$, and

— node (i, n − 1) and node R, for $0 \le i < m$.

The graph G(3, 6) is illustrated in Figure 1.




Fig. 1. The graph G(3, 6) 
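The definition of G(m, n) translates directly into code. The following Python sketch builds the edge set and an adjacency map; the function names and the representation of S and R as the strings "S" and "R" are assumptions made for illustration.

```python
def build_mesh(m, n):
    """Return the set of undirected edges of G(m, n) as (u, v) pairs."""
    edges = set()
    for i in range(m):
        edges.add(("S", (i, 0)))               # sender to first column
        edges.add(((i, n - 1), "R"))           # last column to receiver
        for j in range(n - 1):
            edges.add(((i, j), (i, j + 1)))    # horizontal links
    for i in range(m - 1):
        for j in range(n):
            edges.add(((i, j), (i + 1, j)))    # vertical links
    return edges


def adjacency(edges):
    """Map every node to the set of its neighbours."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return neighbours
```

For G(3, 6) this yields 3 + 3 sender and receiver links, 15 horizontal links and 12 vertical links, i.e. 33 edges over 20 nodes.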



Processors communicate by sending packets along links in the network. Each 
packet consists of data (i.e. the message) and a header. The processor at an
intermediate node may use information in the header to determine what packets
to send to its neighbours, but it cannot use the data for this purpose. Headers
may be modified arbitrarily; however, data must be treated as a "black box".
That is, processors may make copies of the data, but they may not modify it.
This data-oblivious assumption is appropriate when one views end-to-end
communication protocols as providing a reliable communication layer that will be
used by many different distributed algorithms. Typically in end-to-end commu- 
nication, one assumes that when a processor receives a packet, it cannot detect 
which of its neighbours sent the packet. This assumption is not relevant to our 
problem since the degree of the underlying network is bounded, and so the iden- 
tity of the sender can be encoded in a constant number of header bits. 

Intermediate processors are assumed to be memoryless. Thus, processors can 
only send packets as a result of receiving a packet and must decide along which 
link(s) to forward the message and how to change the packet header, based only 
on the contents of the header. This is an appropriate model for a network with 
simultaneous traffic between many different pairs of processors, for example, the 
Internet, where no information concerning past traffic is stored. 

The links of the network are either alive or dead. At any time, a live link may 
become dead. Once a link becomes dead, it remains so. Processors do not know 
which subset of the links are alive. For simplicity, it is assumed that processors 
never fail. However, a dead processor can be simulated by considering each of 
its incident links to be dead. 

Live links deliver packets in a first in, first out manner. However, the time 
for a packet to traverse a link may differ at different times or for different links. 
We assume that the time for a packet to traverse a link is finite, but unbounded.
Edges which are dead can be thought of as having infinite delay. In this asyn- 
chronous model, a processor cannot distinguish between a dead link and a link 
which is just very slow. 



1.2 Summary of Results 

In this paper, we consider the problem of sending a single message from S to R. 
Our goal is to ensure that 

— as long as there is some simple S-R path of live links, at least one copy of 
the message gets sent from S to R, and 

— even if all links are alive, only a finite number of packets are generated. 

We say that a protocol which satisfies these requirements delivers a message from
S to R with finite traffic. In this paper, we provide an algorithm that does this
using $O(m(\log\log n + \log m))$-bit headers for any network G(m, n). For the
case of G(3, n), this is improved in the full version of our paper to
$\log_2 \log_2 n + O(1)$. In Section 3 we demonstrate that for G(3, n),
$\log_2 \log_2 n - O(\log\log\log n)$ bits are required. Using the following
observation of Adler and Fich [AF99], this lower bound can be extended to
G(m, n).

Proposition 1. Suppose G' is a minor of G and S' and R' are the supernodes 
of G' containing S and R, respectively. Then any protocol for G that delivers 
a message from S to R with finite traffic gives a protocol for G' with the same 
packet headers that delivers a message from S' to R' with finite traffic. 

In particular, since for $m \ge 3$, G(m, n) has G(3, n) as a minor,
$\log_2 \log_2 n - O(\log\log\log n)$ bits are required for G(m, n). Thus, for
any constant $m \ge 3$, we have optimal bounds to within a constant factor on
the number of header bits that are necessary and sufficient to deliver a message
from S to R with finite traffic in G(m, n). For the case of G(3, n), our bounds
are within an additive term of $O(\log\log\log n)$ from optimal.

Our upper bounds use a new technique to obtain an approximate count of
how many nodes a message has visited, which is sufficient to guarantee that only
a finite number of packets are generated. This technique may have applications to
other networks. By Proposition 1, our upper bounds also provide upper bounds
for any graphs that are minors of G(m, n), for any constant m.

The next section describes our protocol for G(m, n) for any constant
$m \ge 3$. This is followed in Section 3 by our lower bound for G(3, n) and,
hence, for G(m, n) with $m \ge 3$.



2 A Protocol for G(m, n)

In this section, we provide an upper bound on the header size required for sending
a single message from S to R in G(m, n). Since G(m, n) is a minor of G(m, n')
for all $n \le n'$, by Proposition 1 it suffices to assume that $n = 2^h + 1$
for some positive integer h.

We begin by giving a characterization of certain simple paths. The
characterization will be used in Lemma 1 to parse simple paths. We will be
considering sub-paths that go from left-to-right (from small $c_1$ to big $c_2$)
and also sub-paths that go from right-to-left (from big $c_1$ to small $c_2$),
but we will always work within a bounded region of rows consisting of row $r_1$
up to row $r_2$.

Definition 1. For $r_1 \le r_2$ and $c_1 \neq c_2$, a
$(c_1, c_2, r_1, r_2)$-bounded path is a simple path that starts in column
$c_1$, ends in column $c_2$, and does not go through any node in a column less
than $\min\{c_1, c_2\}$, a column greater than $\max\{c_1, c_2\}$, a row less
than $r_1$, or a row greater than $r_2$.

Note that every simple path from the first column of G(m, n) to the last
column of G(m, n) is a (0, n − 1, 0, m − 1)-bounded path. A
$(c_1, c_2, r, r)$-bounded path is a simple path of horizontal edges.

Definition 2. For $r_1 < r_2$ and $c_1 \neq c_2$, a
$(c_1, c_2, r_1, r_2)$-bounded loop is a simple path that starts and ends in
column $c_1$, and does not go through any node in a column less than
$\min\{c_1, c_2\}$, a column greater than $\max\{c_1, c_2\}$, a row less than
$r_1$, or a row greater than $r_2$.

We focus attention on bounded paths between columns which are consecutive
multiples of some power of 2, i.e. from column $c 2^k$ to column $c' 2^k$, where
$c' = c \pm 1$.

Lemma 1. Let $c_1, c_2, c_3$ be consecutive nonnegative integers, with $c_2$
odd, and let k be a nonnegative integer. Then every
$(c_1 2^k, c_3 2^k, r_1, r_2)$-bounded path can be decomposed into a
$(c_1 2^k, c_2 2^k, r_1, r_2)$-bounded path, followed by a series of
$r_2 - r_1$ or fewer $(c_2 2^k, c_1 2^k, r_1, r_2)$- and
$(c_2 2^k, c_3 2^k, r_1, r_2)$-bounded loops, followed by a
$(c_2 2^k, c_3 2^k, r_1, r_2)$-bounded path.

Proof. Consider any $(c_1 2^k, c_3 2^k, r_1, r_2)$-bounded path. The portion of
the path until a node in column $c_2 2^k$ is first encountered is the first
subpath, the portion of the path after a node in column $c_2 2^k$ is last
encountered is the last subpath, and the remainder of the path is the series of
loops starting and ending in column $c_2 2^k$. The bound on the number of loops
follows from the fact that the path is simple, so the first subpath and each of
the loops end on different nodes in column $c_2 2^k$. □
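The decomposition used in this proof can be sketched in a few lines of Python. The representation of a path as a list of (row, column) pairs and the name `decompose_at_column` are assumptions of this sketch; the function splits a path at every visit to the middle column, as the proof describes.

```python
def decompose_at_column(path, mid):
    """Split a simple path at its visits to column `mid`.

    Returns (first, loops, last): `first` ends at the first node in column
    `mid`, each loop starts and ends in column `mid`, and `last` starts at
    the last node in column `mid`.
    """
    visits = [idx for idx, (_row, col) in enumerate(path) if col == mid]
    if not visits:
        raise ValueError("the path never reaches the middle column")
    first = path[: visits[0] + 1]
    last = path[visits[-1]:]
    # Consecutive visits to the middle column delimit one loop each.
    loops = [path[a: b + 1] for a, b in zip(visits, visits[1:])]
    return first, loops, last
```

Because the path is simple, each visit to the middle column is at a distinct row, so the number of loops is at most the number of available rows minus one, matching the bound in the lemma.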

This gives us a recursive decomposition of any simple path from the first
column to the last column of G(m, n), where n is one more than a power of
2. Specifically, such a (0, n − 1, 0, m − 1)-bounded path consists of a
(0, (n − 1)/2, 0, m − 1)-bounded path, followed by a series of at most m − 1
different ((n − 1)/2, n − 1, 0, m − 1)- and ((n − 1)/2, 0, 0, m − 1)-bounded
loops, followed by a ((n − 1)/2, n − 1, 0, m − 1)-bounded path. Each of the
bounded paths can then be similarly decomposed. Furthermore, we can also
decompose the bounded loops.
Lemma 2. Let $k$, $r_1$, $r_2$, $c_1$ and $c_2$ be nonnegative integers, where
$c_1$ and $c_2$ are consecutive, $c_1$ is odd, and $r_1 < r_2$. Then every
$(c_1 2^k, c_2 2^k, r_1, r_2)$-bounded loop can be decomposed into the prefix of
a $(c_1 2^k, c_2 2^k, r_1 + 1, r_2)$-bounded path, followed by a downward edge,
followed by the suffix of a $(c_2 2^k, c_1 2^k, r_1, r_2 - 1)$-bounded path, or
the prefix of a $(c_1 2^k, c_2 2^k, r_1, r_2 - 1)$-bounded path, followed by an
upward edge, followed by the suffix of a
$(c_2 2^k, c_1 2^k, r_1 + 1, r_2)$-bounded path.

Proof. Consider any $(c_1 2^k, c_2 2^k, r_1, r_2)$-bounded loop. Let c be the
column farthest from $c_1 2^k$ that this path reaches and let (r, c) be the
first node in this path in column c. Let $p_1$ be the prefix of this path up to
and including node (r, c). The next edge is vertical. Let $p_2$ be the remainder
of the bounded loop following that edge.

Since the loop is a simple path, paths $p_1$ and $p_2$ do not intersect. Thus,
either $p_1$ is completely above $p_2$, so $p_1$ never uses row $r_1$ and $p_2$
never uses row $r_2$, or $p_1$ is completely below $p_2$, so $p_1$ never uses
row $r_2$ and $p_2$ never uses row $r_1$. □

We use this recursive decomposition of simple paths in our protocol. Instead 
of trying just the simple S-R paths in G{m, n), our protocol tries all S-R paths 
that can be recursively decomposed in this way. 

Our basic building block is a protocol that sends a packet from column ci to 
column C2, where Ci and C2 are consecutive multiples of some power of 2, using 
some set of r adjacent rows. The protocol does this by first sending the packet 
from column ci to the middle column (ci -I- C2)/2, recursively. Then it sends the 
packet looping around the middle column at most r — 1 times. Each loop consists 
of a first half and a second half, each of which uses at most r — 1 rows. Both of 
these subproblems are solved recursively. Finally, the protocol recursively sends 
the packet from the middle column to column C2. 

It follows by Lemmas 1 and 2 that, if there is a simple path of live edges from
S to R, then our protocol finds it. Note that, at the lowest level of the recursion,
a packet is always travelling in what is considered the forward direction (when 
the bounded path is from right to left, this will be in the backwards direction 
of the original problem, but still in the forward direction of the lowest level 
subproblem). Thus, the difficult part of this protocol is performing the bounded 
loops in such a way that the packet does not travel in an infinite loop. 

Let $\#_2(0) = \infty$ and, for every positive integer $c$, let $\#_2(c)$ denote the
exponent of the largest power of two that divides $c$. Thus, if $c$ can be expressed
as $c_1 2^k$ for an odd number $c_1$, then $\#_2(c) = k$. In our protocol, the packet
header is used to keep track of the column in which the current loop started and
the distance to the other column boundary. If we naively stored these numbers,
then $\Omega(\log n)$ header bits would be required. However, because our
decomposition only uses bounded loops of the form
$(c_1 2^k, (c_1 \pm 1) 2^k, r_1, r_2)$, where $c_1$ is odd, it is sufficient to
keep track of $k$ (i.e., $\#_2(c_1 2^k)$). Note that $k$ can be represented using only
$\lceil \log_2 \log_2 (n-1) \rceil$ bits. Using the quantity $k$, a packet can tell
when it reaches its boundary columns. In particular, while its current column $c$
is strictly between the boundaries, $\#_2(c) < k$, but when $c$ is at a boundary,
$\#_2(c) \ge k$.
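The boundary test described above can be illustrated with a short Python sketch. The function names `two_adic_valuation` and `at_boundary` are ours and do not appear in the paper; the code only mirrors the definition of $\#_2(c)$ and the comparison against $k$.

```python
import math

def two_adic_valuation(c: int) -> float:
    """Return #_2(c): the exponent of the largest power of two dividing c.

    By the convention in the text, #_2(0) is infinity.
    """
    if c == 0:
        return math.inf
    k = 0
    while c % 2 == 0:
        c //= 2
        k += 1
    return k

def at_boundary(c: int, k: int) -> bool:
    """True when column c is a boundary column of a loop stored with power k."""
    return two_adic_valuation(c) >= k

# Example: a loop between columns 3 * 2**2 = 12 and 4 * 2**2 = 16 (k = 2).
# Only the two end columns pass the boundary test.
boundary_columns = [c for c in range(12, 17) if at_boundary(c, 2)]
```

In this example the packet only needs the value 2 in its header to recognise columns 12 and 16, rather than the column numbers themselves.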

When the algorithm is doing a bounded loop from column $c_1 2^k$ the following
quantities are stored. 






- power = $\#_2(c_1 2^k)$ (which is equal to $k$),

- minRow, the smallest row that can be used,

- maxRow, the largest row that can be used,

- loopCounter, the number of loops that have already been done around
column $c_1 2^k$ in the current path,

- loopHalf (0 if the current packet is in the first bounded path that forms this
loop and 1 if it is in the second),

- forward, the direction in which the packet is travelling on the current path
(+1 if the packet is going from left to right and -1 if it is going from right to
left).

Although our path decomposition has $\log_2(n-1)$ levels of recursion, at most
$m$ loops can be active at any one time. This follows from Lemma 2 since the
number of allowed rows decreases by 1 for each active loop. We shall think
of the bits in the packet header as a stack and, for each active loop, the above
mentioned variables will be pushed onto the stack. Finally, we use two additional
bits with each transmission to ensure that any node receiving a packet knows
where that packet came from. In total, our protocol uses headers with at most
$O(m(\log\log n + \log m))$ bits.

At the start, S sends a packet to each node in column 0. The header of 
each packet contains the following information in its only stack entry: power 
= $\log_2(n-1)$, minRow = 0, maxRow = $m-1$, forward = 1, loopHalf = 1,
and loopCounter = 0. (To be consistent with other levels of recursion, we are
thinking of the path from column 0 to column $n-1$ as being the second half of
a $(n-1, 0, 0, m-1)$-bounded loop.)

We shall refer to the variable d = m — maxRow + minRow, which is equal 
to the recursion depth. We describe the actions of any node (r, c) that does not 
appear in the first or last column of G(m,n). The actions of the nodes in the 
first (or last) column are identical, except that they do not perform the specified 
forwarding of packets to the left (or right, respectively). In addition, if a node in 
the last column of G(m, n) ever receives a packet, it forwards that packet to R. 
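As a rough illustration of the header contents, the sketch below models one stack record and the initial header in Python. The class name `Record`, the snake_case field names, and the helper functions are ours; the code also assumes, as the value power = $\log_2(n-1)$ suggests, that $n-1$ is a power of two.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """One stack entry of the packet header (field names are illustrative)."""
    power: int
    min_row: int
    max_row: int
    loop_counter: int
    loop_half: int   # 0 = first half of the loop, 1 = second half
    forward: int     # +1 = left to right, -1 = right to left

def initial_record(m: int, n: int) -> Record:
    """The single record in the header of the packets that S sends to column 0."""
    power = (n - 1).bit_length() - 1
    if n < 2 or 2 ** power != n - 1:
        raise ValueError("this sketch assumes n - 1 is a power of two")
    return Record(power=power, min_row=0, max_row=m - 1,
                  loop_counter=0, loop_half=1, forward=1)

def recursion_depth(m: int, top: Record) -> int:
    """The quantity d = m - maxRow + minRow from the text."""
    return m - top.max_row + top.min_row

start = initial_record(m=3, n=9)
```

For the initial record the depth evaluates to 1, matching the single entry on the stack at the start.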

Protocol DELIVER 

On receipt of a packet at node (r, c) with (power, minRow, maxRow, loopCounter, 
loopHalf, forward) at the top of its stack 

/* The default move is to forward a packet up, down, and in the current 
direction of travel. */ 

• If r < maxRow and the packet was not received from node (r + 1, c), send
the packet to node (r + 1, c).

• If r > minRow and the packet was not received from node (r - 1, c), send
the packet to node (r - 1, c).

• If power > $\#_2(c)$, then send the packet to node (r, c + forward).

/* In addition, we may choose to start a set of loops starting at the current 
column. This can happen only if r > minRow or r < maxRow, either of which 
implies that d < m. */ 
• If power > $\#_2(c)$ and r > minRow, then, for $f = \pm 1$, send the packet to
node (r, c + f) with ($\#_2(c)$, minRow + 1, maxRow, 0, 0, f) pushed onto its
stack.

• If power > $\#_2(c)$ and r < maxRow, then, for $f = \pm 1$, send the packet to
node (r, c + f) with ($\#_2(c)$, minRow, maxRow - 1, 0, 0, f) pushed onto its
stack.

/* If a loop is in its first half, it can switch to the second half at any step. */ 

• If loopHalf = 0, let minRow' denote the value of minRow at the previous 
level of recursion (i.e. in the record second from the top of the stack). 

If minRow = minRow' 

- then send the packet to node (r + 1, c) with (power, minRow + 1, maxRow + 1,
loopCounter, 1, -forward) replacing the top record on its stack.

- else send the packet to node (r - 1, c) with (power, minRow - 1, maxRow - 1,
loopCounter, 1, -forward) replacing the top record on its stack.

/* If a packet has returned to the column where it started its current set of 
loops, it has two options. */ 

• If $\#_2(c) \ge$ power and loopHalf = 1 then

/* Option 1: start the next loop in the set. Note that if the second half of
the previous loop allows the use of rows $r_1$ to $r_2$, then the previous level of
the recursion allows the use of either rows $r_1$ to $r_2 + 1$ or rows $r_1 - 1$ to $r_2$.
In the first case, the first half of the next loop can use either rows $r_1$ to $r_2$
or rows $r_1 + 1$ to $r_2 + 1$. In the second case, the first half of the next loop
can use either rows $r_1$ to $r_2$ or rows $r_1 - 1$ to $r_2 - 1$. */

- If loopCounter < maxRow - minRow - 1, then

* For $f = \pm 1$, send the packet to node (r, c + f) with (power, minRow,
maxRow, loopCounter + 1, 0, f) replacing the top record on its stack.

* Let minRow' and maxRow' denote the value of minRow and maxRow
at the previous level of recursion (i.e. in the record second from the
top of the stack).

* If minRow = minRow' and r > minRow then, for $f = \pm 1$, send
the packet to node (r, c + f) with (power, minRow + 1, maxRow + 1,
loopCounter + 1, 0, f) replacing the top record on its stack.

* If maxRow = maxRow' and r < maxRow then, for $f = \pm 1$, send
the packet to node (r, c + f) with (power, minRow - 1, maxRow - 1,
loopCounter + 1, 0, f) replacing the top record on its stack.

/* Option 2: stop the current set of loops and return to the previous level 
of the recursion. */ 

— If d > 1, pop one record off the stack. Let forward' denote the value of 
forward at the new top level of the stack. Send the resulting packet to 
node (r, c + forward') . 



End of protocol. 
Lemma 3. The header of any packet produced by the Protocol DELIVER has
a length of at most $m(\lfloor \log_2 \log_2 (n-1) \rfloor + 3\lceil \log_2 m \rceil + 3) + 2$ bits.

Proof. It is easily verified that the maximum depth of the recursion produced
by Protocol DELIVER is $m$. For each such level, the variable power can be
represented using $\lfloor \log_2 \log_2 (n-1) \rfloor + 1$ bits, the variables maxRow,
minRow, and loopCounter can each be represented using $\lceil \log_2 m \rceil$ bits, and
forward and loopHalf can each be represented using a single bit. The final two
bits come from the fact that each transmission informs the recipient of the
direction from which the packet came. □
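To make the bound of Lemma 3 concrete, here is a small Python sketch that evaluates it with integer arithmetic. The function name `header_bits_bound` is ours, and the sketch assumes $n \ge 3$ so that $\log_2 \log_2 (n-1)$ is defined.

```python
def header_bits_bound(m: int, n: int) -> int:
    """Evaluate m * (floor(log2 log2(n-1)) + 3 * ceil(log2 m) + 3) + 2."""
    if m < 1 or n < 3:
        raise ValueError("this sketch assumes m >= 1 and n >= 3")
    log_n = (n - 1).bit_length() - 1        # floor(log2(n - 1))
    power_bits = log_n.bit_length()         # floor(log2 log2(n - 1)) + 1
    row_bits = (m - 1).bit_length()         # ceil(log2 m)
    # One record: power, three row-sized fields, forward, and loopHalf.
    per_record = power_bits + 3 * row_bits + 2
    # At most m records, plus two bits for the direction of arrival.
    return m * per_record + 2

bits_small = header_bits_bound(3, 17)
bits_wide = header_bits_bound(3, 2 ** 20 + 1)
```

Growing $n$ from 17 to about a million columns only adds a few bits per record, which is the doubly logarithmic dependence on $n$ that the lemma states.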



Lemma 4. Protocol DELIVER transmits only a finite number of packets. 

Proof. We provide a potential function $\Phi$ for any packet in the system, such
that there is a maximum value that $\Phi$ can attain and, every time a packet is
forwarded, the corresponding value of $\Phi$ is increased by at least 1. (That is, each
packet P has a potential exceeding the potential of the packet whose arrival
caused P to be sent.) For each level of recursion $i$, $1 \le i \le m$, we define three
variables: $lc_i$, $lh_i$, and $dist_i$. All of these variables are defined to be 0 if
$i > d$, the current recursion depth. For $i \le d$, $lc_i$ and $lh_i$ are the loopCounter
and loopHalf variables, respectively, for level $i$ in the recursion. For $i \le d$, the
variable $dist_i$ is the number of horizontal steps taken by the packet starting from
the time that the forward variable at the $i$th level of recursion was last set,
counting only those steps that occurred when $d = i$. Note that a packet can only
move horizontally in the direction specified by the forward variable, and thus all
of these steps will be in the same direction. This means that $dist_i \le n$. We also
define the variable vert to be the number of steps taken in a vertical direction
on the current column since last moving there from another column.

The potential function $\Phi$ that we define can be thought of as a $(3m+1)$-digit
mixed radix number, where for $t \in \{1, \ldots, m\}$, digit $3(t-1)+1$ is $lc_t$, digit
$3(t-1)+2$ is $lh_t$, and digit $3(t-1)+3$ is $dist_t$. Digit $3m+1$ is vert. It is
easily verified that when a packet is first sent, $\Phi \ge 0$. Also, by checking each
of the possible actions of a node on the receipt of a packet, we can verify that
every time a packet is forwarded, $\Phi$ increases by at least 1. We also see that $\Phi$
is bounded, since vert $\le m-1$ and, for any $i$, $lc_i \le m$, $dist_i \le n$, and $lh_i \le 1$.
Since each packet receipt causes at most a constant number of new packets to be
sent out, it follows that the total number of packets sent as a result of Protocol
DELIVER is finite. □

It follows from the decomposition of simple S-R paths given by Lemmas 1
and 2 that, if there is a simple path of live edges from S to R, then Protocol
DELIVER finds it. We combine Lemmas 3 and 4 to get our main result.

Theorem 1. Protocol DELIVER delivers a message from S to R with finite
traffic using $O(m(\log\log n + \log m))$-bit headers.
3 A Lower Bound

In this section, we prove that $\Omega(\log\log n)$ header bits are necessary for
communicating a single message in a $3 \times n$ grid. First, we consider the graph
$G(3, n)$ with $n = h!$. The proof is similar in flavour to the lower bound for
communicating a
single message in a complete graph EEnn]. 

Our proof uses the following definitions. An S-path of extent $j \ge 1$ is a path
from $(0, c)$ to $(2, c+j-1)$, for some column $c$, where $0 \le c \le n-j$. It consists
of

- a left-to-right path of length $j-1$ along the bottom row from $(0, c)$ to
$(0, c+j-1)$, followed by

- the vertical edge from $(0, c+j-1)$ to $(1, c+j-1)$, followed by

- a right-to-left path of length $j-1$ along the middle row from $(1, c+j-1)$
to $(1, c)$, followed by

- the vertical edge from $(1, c)$ to $(2, c)$, followed by

- a left-to-right path of length $j-1$ along the top row from $(2, c)$ to $(2, c+j-1)$.

Thus, an S-path of extent $j$ contains $3(j-1)$ horizontal edges and 2 vertical
edges, for a total length of $3j-1$. Similarly, a Z-path of extent $j$ is a simple path
of total length $3j-1$ from $(2, c)$ to $(2, c+j-1)$, to $(1, c+j-1)$, to $(1, c)$, to
$(0, c)$, and finally to $(0, c+j-1)$.

Our proof focusses attention on $h$ particular simple S-R paths, defined as
follows. For $k = 1, \ldots, h$, let $P_k$ consist of $k!$ alternating S-paths and Z-paths,
each of extent $h!/k!$, concatenated using single horizontal edges. Figure 2 shows
paths $P_1$, $P_2$, and $P_3$ for the case $h = 3$.




Fig. 2. Paths $P_1$, $P_2$, and $P_3$ for $h = 3$

For $0 \le i < n$, let $i_1, \ldots, i_h$ be such that $i = \sum_{k=1}^{h} i_k \, n/k!$ where
$0 \le i_k < k$. In other words, $(i_1, \ldots, i_h)$ is the mixed radix representation of
$i$, where the $k$'th most significant digit is in base $k$. Note that $i_1$ always has
value 0. For example, if $n = 24 = 4!$ and $i = 20$, then $i_1 = 0$, $i_2 = 1$, $i_3 = 2$,
and $i_4 = 0$.
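The mixed radix representation can be computed by repeated division, as in the following Python sketch. The function name `mixed_radix_digits` is ours; the code assumes $n = h!$ as in the text.

```python
from math import factorial

def mixed_radix_digits(i: int, h: int) -> tuple:
    """Return (i_1, ..., i_h) with i = sum_k i_k * h!/k! and 0 <= i_k < k."""
    n = factorial(h)
    if not 0 <= i < n:
        raise ValueError("i must satisfy 0 <= i < h!")
    digits = []
    for k in range(1, h + 1):
        weight = n // factorial(k)       # place value n/k! of the k-th digit
        digit, i = divmod(i, weight)
        digits.append(digit)
    return tuple(digits)

example = mixed_radix_digits(20, 4)      # the example from the text: n = 24, i = 20
```

The example reproduces the digits given above, since $20 = 0 \cdot 24 + 1 \cdot 12 + 2 \cdot 4 + 0 \cdot 1$.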

Proposition 2. Let $0 \le i < j < n$. Node $(1, j)$ appears before node $(1, i)$ in
path $P_k$ if and only if $i_d = j_d$ for $d = 1, \ldots, k$.

Proof. In every S-path or Z-path, the nodes in row 1 appear in order from
largest numbered column to smallest numbered column. Since path $P_k$ is the
concatenation of S-paths and Z-paths, node $(1, j)$ appears before node $(1, i)$ if
and only if columns $i$ and $j$ are in the same S-path or Z-path. Since each S-path
and Z-path comprising $P_k$ has extent $n/k!$, it follows that $i$ and $j$ are in the
same S-path or Z-path if and only if $\lfloor i/(n/k!) \rfloor = \lfloor j/(n/k!) \rfloor$, which is true if
and only if $i_d = j_d$ for $d = 1, \ldots, k$. □
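Proposition 2 can be checked by brute force for small $h$. The Python sketch below builds the node sequence of $P_k$ and compares the order of the row-1 nodes with the mixed radix digits. All function names are ours, and the sketch assumes that $P_k$ begins with an S-path at column 0, which the text does not state explicitly.

```python
from math import factorial

def mixed_radix_digits(i, h):
    """Digits (i_1, ..., i_h) with i = sum_k i_k * h!/k! and 0 <= i_k < k."""
    n, digits = factorial(h), []
    for k in range(1, h + 1):
        digit, i = divmod(i, n // factorial(k))
        digits.append(digit)
    return tuple(digits)

def s_path(c, j):
    """Nodes (row, column) of an S-path of extent j from (0, c) to (2, c+j-1)."""
    cols = list(range(c, c + j))
    return ([(0, x) for x in cols] + [(1, x) for x in reversed(cols)]
            + [(2, x) for x in cols])

def z_path(c, j):
    """Nodes (row, column) of a Z-path of extent j from (2, c) to (0, c+j-1)."""
    cols = list(range(c, c + j))
    return ([(2, x) for x in cols] + [(1, x) for x in reversed(cols)]
            + [(0, x) for x in cols])

def path_P(k, h):
    """Node sequence of P_k in G(3, h!): k! alternating pieces of extent h!/k!."""
    extent = factorial(h) // factorial(k)
    nodes = []
    for block in range(factorial(k)):
        piece = s_path if block % 2 == 0 else z_path   # assumption: S-path first
        nodes += piece(block * extent, extent)
    return nodes

def proposition_2_holds(k, h):
    """Check the 'if and only if' of Proposition 2 for every pair i < j."""
    n = factorial(h)
    position = {node: t for t, node in enumerate(path_P(k, h))}
    for i in range(n):
        for j in range(i + 1, n):
            j_first = position[(1, j)] < position[(1, i)]
            same = mixed_radix_digits(i, h)[:k] == mixed_radix_digits(j, h)[:k]
            if j_first != same:
                return False
    return True

checked = all(proposition_2_holds(k, 4) for k in range(1, 5))
```

Consecutive pieces join by a single horizontal edge because an S-path ends in the top row and the following Z-path starts in the top row of the next column, and symmetrically for the bottom row.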

Consider any protocol for $G(3, h!)$ that delivers a message from S to R with
finite traffic. Since node $(1, c)$ is on path $P_k$, it receives at least one packet when
only the links on the simple S-R path $P_k$ are alive. Let $H_k(c)$ denote the header
of the last packet received by node $(1, c)$ in this situation that causes a packet
to be received by R.

Lemma 5. Consider any protocol for $G(3, h!)$ that delivers a message from S
to R with finite traffic. Then, for all path indices $1 \le j < k \le h$ and all columns
$0 \le c < c' < h!$ such that $(c_1, c_2, \ldots, c_j) = (c'_1, c'_2, \ldots, c'_j)$ and
$(c_1, c_2, \ldots, c_k) \ne (c'_1, c'_2, \ldots, c'_k)$, either $H_j(c) \ne H_k(c)$ or $H_j(c') \ne H_k(c')$.

Proof. To obtain a contradiction, suppose that $H_j(c) = H_k(c)$ and $H_j(c') =
H_k(c')$, for some path indices $1 \le j < k \le h$ and some columns $0 \le c < c' < h!$
such that $(c_1, c_2, \ldots, c_j) = (c'_1, c'_2, \ldots, c'_j)$ and
$(c_1, c_2, \ldots, c_k) \ne (c'_1, c'_2, \ldots, c'_k)$.
Then, by Proposition 2, node $(1, c')$ appears before node $(1, c)$ in path $P_j$ but
after node $(1, c)$ in path $P_k$.

Consider the situation when the links on both paths $P_j$ and $P_k$ are alive.
The protocol forwards a packet along path $P_k$ until a packet with header $H_k(c')$
reaches node $(1, c')$. This causes a packet to be received by R. Since $H_k(c') =
H_j(c')$ and node $(1, c')$ occurs before node $(1, c)$ on path $P_j$, it also causes a packet
with header $H_j(c)$ to be received at node $(1, c)$. Likewise, since $H_j(c) = H_k(c)$
and node $(1, c)$ occurs before node $(1, c')$ on path $P_k$, this causes a packet with
header $H_k(c')$ to be received at node $(1, c')$, and we have an infinite loop. Each
time such a packet goes through the loop, it produces a new packet that is sent
to the destination R. This contradicts the finite-traffic assumption. □

Lemma 6. Consider any protocol for $G(3, h!)$ that delivers a message from S
to R with finite traffic. Then, for $1 \le k \le h$, there exist nonnegative digits
$i_1 < 1, i_2 < 2, \ldots, i_k < k$ such that the $k$ headers $H_1(c), \ldots, H_k(c)$ are distinct
for each column $c$ with $(c_1, c_2, \ldots, c_k) = (i_1, i_2, \ldots, i_k)$.

Proof. To obtain a contradiction, suppose the lemma is false. Consider the
smallest value of $k \le h$ for which the lemma is false. Since there are no repetitions
in a sequence of length one, $k > 1$. Let $i_1 < 1, i_2 < 2, \ldots, i_{k-1} < k-1$
be such that the $k-1$ headers $H_1(c), \ldots, H_{k-1}(c)$ are distinct for each
column $c$ with $(c_1, c_2, \ldots, c_{k-1}) = (i_1, i_2, \ldots, i_{k-1})$. Then, for each digit
$i_k \in \{0, \ldots, k-1\}$, there exists a path index $j \in \{1, \ldots, k-1\}$ and a column $c$
such that $(c_1, c_2, \ldots, c_{k-1}, c_k) = (i_1, i_2, \ldots, i_{k-1}, i_k)$ and $H_k(c) = H_j(c)$.

Since there are $k$ choices for $i_k$ and only $k-1$ choices for $j$, the pigeonhole
principle implies that there exist distinct $i_k, i'_k \in \{0, \ldots, k-1\}$ which give rise to
the same value of $j$ and there exist columns $c$ and $c'$ such that
$(c_1, c_2, \ldots, c_{k-1}) = (c'_1, c'_2, \ldots, c'_{k-1})$, $c_k = i_k \ne i'_k = c'_k$,
$H_k(c) = H_j(c)$, and $H_k(c') = H_j(c')$. But this contradicts Lemma 5. □
Theorem 2. Any protocol for $G(3, n)$ that delivers a message from S to R with
finite traffic uses headers of length at least $\log_2 \log_2 n - O(\log\log\log n)$.

Proof. Let $h$ be the largest integer such that $n \ge h!$. Then $n < (h+1)! \le (h+1)^h$,
so $h \log_2(h+1) > \log_2 n$ and $h \in \Omega(\log n/\log\log n)$.

Consider any protocol for $G(3, n)$ that uses headers of length $L$. Since $G(3, h!)$
is a minor of $G(3, n)$, it follows from Proposition 1 that there is a protocol for
$G(3, h!)$ using headers of length $L$. Hence, by Lemma 6, $L \ge \log_2 h =
\log_2 \log_2 n - O(\log\log\log n)$. □
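As a numerical illustration of the quantities in this proof, the Python sketch below finds the largest $h$ with $h! \le n$ and evaluates $\log_2 h$. The function names are ours, and the code only evaluates the expressions used in the proof; it is not part of the argument.

```python
from math import factorial, log2

def largest_h(n: int) -> int:
    """Largest integer h with h! <= n (requires n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = 1
    while factorial(h + 1) <= n:
        h += 1
    return h

def header_lower_bound(n: int) -> float:
    """The quantity log2(h) from the proof of Theorem 2."""
    return log2(largest_h(n))

h_for_24 = largest_h(24)
bound_for_24 = header_lower_bound(24)
```

For $n = 24$ this gives $h = 4$ and a bound of 2 bits; the bound grows very slowly, in line with the doubly logarithmic statement of the theorem.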
Abstract. Given a (possibly directed) network, the wavelength assign-
ment problem is to minimize the number of wavelengths that must be
assigned to communication paths so that paths sharing an edge are as-
signed different wavelengths. Our generalization to multigraphs with k
parallel edges for each link (k fibres per link, with switches at nodes)
may be of practical interest. While the wavelength assignment problem
is NP-hard, even for a single fibre, and even in the case of simple net-
work topologies such as rings and trees, the new model suggests many
nice combinatorial problems, some of which we solve. For example, we
show that for many network topologies, such as rings, stars, and spe-
cific trees, the number of wavelengths needed in the k-fibre model is
less than a 1/k fraction of the number required for a single fibre. We also
study the existence and behavior of a gap between the minimum number
of wavelengths and the natural lower bound of network congestion, the
maximum number of communication paths sharing an edge. For optical
stars (any size) while there is a 3/2 gap in the single fibre model, we show
that with 2 fibres the gap is 0, and present a polynomial time algorithm
that finds an optimal assignment. In contrast, we show that there is no
fixed constant k such that for every ring and every set of communication
paths the gap can be eliminated. A similar statement holds for trees.
However, for rings, the gap can be made arbitrarily small, given enough
fibres. The gap can even be eliminated, if the length of communication
paths is bounded by a constant. We show the existence of anomalies:
increasing the number of fibres may increase the gap.



1 Introduction 

We study a collection of interesting combinatorial questions, motivated by op- 
timization problems in the context of optical interconnection networks. For the 
purposes of this paper, an all-optical network consists of routing nodes intercon- 
nected by point-to-point fibre-optic links, which can support a certain number 
of wavelengths. Links are bidirectional. Each message travels through the net- 
work on a specific wavelength that, in this model, cannot be changed during the 
transmission. Two variants of this model of optical networks have been studied 
intensively (see |2j for an up-to-date survey): the directed model, in which two 
messages traveling on the same fibre-optic link in the same direction must have 
different wavelengths, and the undirected model, in which two messages passing 
through the same fibre-optic link must have different wavelengths no matter in
which direction they are traveling ^1. In what follows, a message will be called a
request and a set of requests will be called an instance. Given a network G and
an instance I on G, the wavelength-routing problem consists of finding a routing
scheme R for I and an assignment of a wavelength to each request of I, such
that no two paths of R sharing an edge have the same wavelength, and such that
the total number of wavelengths is minimized. In this paper we do not address 
the problem of obtaining a good routing scheme for a set of requests on a given 
network. We will assume that the routing scheme is given as part of the input, 
or that it is uniquely determined by the topology of the network we are consid- 
ering (as in the case of optical trees) . We will focus on the problem of assigning 
wavelengths to a given set of communication paths. The problem of finding an 
optimal wavelength assignment is NP-hard even if we restrict our attention to 
very simple network families such as rings or trees iPO]. In some cases, there 
exist polynomial time greedy algorithms (see for example jl 619] 1 which provide 
approximate solutions for this problem in terms of the network congestion (the
maximum number of connection paths which share a fibre-optic link) which, in 
turn, is a lower bound on the optimal number of wavelengths. 

We define and analyze a new optical network model: Each point-to-point 
fibre-optic link consists of k distinct optical fibres (the same k for each link). 
This assumption is very natural ^2, and suggests many interesting algorithmic
and combinatorial questions. In this new model, each time we send a message 
through a link we need to specify which fibre we want to use. Two paths sharing 
a link can be given the same wavelength if they pass through distinct fibres. 

We ask the following basic question: can we reduce the number of wavelengths
by a factor strictly larger than k using k fibres per link ? We prove that for a 
number of network topologies of practical interest, this question has an affir- 
mative answer. We also identify many challenging and (in our view) interesting 
problems, and provide solutions to some of them. 

The main results of this paper are: 

- We show (Thm. 4) that for any k, m > 1 there exists a network G, an
instance I on G, and a routing R for I, such that the minimal number of wave-
lengths needed using k fibres per link is at least m, while the number of wave-
lengths needed using k + 1 fibres per link is 1. Note that this gives an affirmative
answer to our basic question for instance I.

- For optical star networks we are able to show significant improvement by
using multiple fibres. In the undirected single fibre model every instance can
be routed using a number of wavelengths equal to 3/2 times the congestion
of the network m and this is the best ratio achievable. In contrast, using 2
fibres per link it is possible to route any set of requests on an undirected star
network optimally (i.e. using a number of wavelengths equal to the congestion
of the network, where the congestion of the network using k fibres is defined as
the largest number of paths through any fibre; equivalently, this is the value of
the congestion of the single fibre version of the same network, divided by k).
Moreover, we give a polynomial time algorithm (Thm. 5) that assigns paths to
fibres.

^1 Brief justification for the models: physical links (fibres) are undirected. Current
repeaters and routers aren't.

^2 We do not discuss practicality: it is easy to justify wiring that uses multiple fibres,
and it is possible to build appropriate switches. Whether such switches can be made
economical is not clear. This may also depend on the benefits of multiple fibres. We
hope that papers like ours will eventually determine the amount of these benefits.

770 L. Margara and J. Simon

- In the case of optical tree networks we prove that there is no single constant 
k so that for every tree all instances can be routed using a number of wavelengths 
equal to the congestion of the busiest link. This is true both for undirected 
(Thm.0 and directed ('Thm. fTDIl networks. The theorem holds even if we restrict 
the underlying graph to be the family of undirected trees of height 2. Note that 
this does not mean that it is impossible to eliminate the gap for a fixed graph. 
In fact, we prove that for binary trees of height 2, 4 fibres are enough for closing 
the gap between number of wavelengths and network congestion (Thm.^. 

- For ring networks we give a polynomial time algorithm (Thm. 1121 which 
takes as input an optical ring G with n nodes (either directed or undirected), 
an instance I on G, and a routing scheme R for I and computes a wavelength
assignment for R whose cardinality is at most 1 + 1/k times larger than the
congestion caused by R on G, where k is the number of fibres per link. Note that 
using one fibre per link the best ratio achievable between number of wavelengths 
and network congestion is 2 ^E|. We also prove (Thm. E|) that for every fc > 1 
there exist an optical ring G with n nodes and k fibres per link, an instance I 
on G, and a routing scheme R for I such that the cardinality of any wavelength 
assignment for R is $2k/(2k-1) = 1 + 1/(2k-1)$ times larger than the network
congestion caused by R on G. Finally, we show iThm. lT^ that if all the requests 
have a length uniformly bounded by an arbitrarily chosen constant c, then there 
exists a value kc (which depends only on c) such that using kc fibres per link 
it is possible to close the gap between the number of wavelengths used and the 
network congestion for ring of any size. 

- We show that, perhaps surprisingly, adding fibres may increase the gap 
between load and number of wavelengths (Thm.^. 

The following question remains open: Is it true that given a fixed network G
there exists a number k > 1 such that using k fibres per link it is possible to
route all the possible instances on G using a number of wavelengths equal to the 
congestion of the network ? 

Due to limited space some of the proofs are omitted, or only sketched. 

2 Basic Definitions 

The standard optical model. We represent an all-optical network as a graph
$G = (V, E)$, where $V = \{1, \ldots, n\}$ represents the set of nodes of the network and
$E \subseteq \{(i,j) \mid i,j \in V\}$ represents the set of node-to-node connections available in
the network. A request is an ordered pair of vertices $(s, d)$, $s, d \in V$, corresponding
to a message to be sent from node $s$ to node $d$. An instance $I$ is a collection
of requests. Note that a given request can appear more than once in the same
instance. Let $I = \{(s_1, d_1), \ldots, (s_m, d_m)\}$ be an instance on G. A routing scheme
$R = \{p_1, \ldots, p_m\}$ for $I$ is a set of simple directed paths on G. Each path $p_i$ is
a sequence $(v_1, \ldots, v_k)$ of distinct vertices of V such that $v_1 = s_i$ and $v_k = d_i$.
We say that a path $p = (v_1, \ldots, v_k)$ contains the edge $(i,j)$ if there exists
$h$, $1 \le h \le k-1$, such that $v_h = i$ and $v_{h+1} = j$. A legal wavelength assignment
W of cardinality $m$ for a routing scheme R is a map from R to $[1, \ldots, m]$ such
that if two elements $p, q \in R$ share an edge then $W(p) \ne W(q)$, i.e., they are
given distinct wavelengths. This defines two variant models, depending on the
interpretation of ". . . sharing an edge. . .":

- Directed model. Two paths $p$ and $q$ share the edge $(i,j)$ if both $p$ and $q$
contain the edge $(i,j)$.

- Undirected model. Two paths $p$ and $q$ share the edge $(i,j)$ if both $p$ and $q$
contain at least one edge in the set $\{(i,j), (j,i)\}$.

The wavelength-routing problem can be formulated as follows: Given a graph G
and an instance I on G, find a routing scheme R for I and a legal wavelength
assignment W for R such that the cardinality of W is the minimum among all
possible routing schemes R for I and legal wavelength assignments W for R.
The new optical model. A legal k-wavelength assignment W of cardinality $m$
for a routing scheme R is a map from R to $[1, \ldots, m]$ such that if $k+1$ elements
$p_1, \ldots, p_{k+1} \in R$ share an edge then there exist $1 \le i, j \le k+1$ such that
$W(p_i) \ne W(p_j)$.

Our definition is equivalent to considering a multigraph obtained from G by
replacing every edge by a set of k parallel edges. Note that we consider both
directed and undirected legal k-wavelength assignments. In the directed model
k paths $p_1, \ldots, p_k \in R$ share the edge $(i,j)$ iff every $p_i$, $1 \le i \le k$, contains the
edge $(i,j)$, while in the undirected model k paths $p_1, \ldots, p_k \in R$ share the edge
$(i,j)$ if every $p_i$, $1 \le i \le k$, contains at least one edge in the set $\{(i,j), (j,i)\}$.
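The definition of a legal k-wavelength assignment can be read as a simple check: on every edge, no wavelength may be carried by more than k paths. The Python sketch below implements this reading; the function names, and the representation of paths as lists of integer vertices, are our own choices for illustration.

```python
from collections import Counter

def edges_of(path, directed):
    """Yield the edges of a path; in the undirected model (i,j) and (j,i) coincide."""
    for u, v in zip(path, path[1:]):
        yield (u, v) if directed else (min(u, v), max(u, v))

def is_legal(paths, wavelengths, k, directed=False):
    """True if no edge carries more than k paths with the same wavelength."""
    usage = Counter()
    for path, w in zip(paths, wavelengths):
        for edge in set(edges_of(path, directed)):
            usage[(edge, w)] += 1
    return all(count <= k for count in usage.values())

# Three requests on a star with centre 0 and leaves 1, 2, 3 (undirected model).
star_paths = [[1, 0, 2], [2, 0, 3], [1, 0, 3]]
one_fibre_same = is_legal(star_paths, [1, 1, 1], k=1)
one_fibre_distinct = is_legal(star_paths, [1, 2, 3], k=1)
two_fibres_same = is_legal(star_paths, [1, 1, 1], k=2)
```

In this small example every pair of paths shares an edge, so a single fibre needs three wavelengths, whereas with two fibres per link one wavelength suffices because each edge carries only two paths.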

Number of wavelengths and network congestion. Let $I$ be an instance on
a graph $G = (V, E)$. Let $R = \{p_1, \ldots, p_m\}$ be a routing scheme for $I$. We define
the conflict graph $G_c = (V_c, E_c)$ for R and G as having vertices $V_c = R$ and
edges $E_c = \{(p_i, p_j) \mid p_i \text{ and } p_j \text{ share an edge of } G\}$. We denote by $W(R, G, k)$
the cardinality of the best possible legal k-wavelength assignment for R. It is easy
to verify that $W(R, G, 1)$ is equal to the chromatic number of $G_c$. Let $L(R, G, a)$,
the load of $a$, be the maximum number of paths of R sharing the edge $a$. Let
$L(R, G)$ be the maximum of $L(R, G, a)$ over all the edges $a$ of G. It is easy to
verify that $W(R, G, k) \ge \lceil \frac{1}{k} L(R, G) \rceil$. In the 1-fibre case ($k = 1$) $L(R, G)$ is
called the congestion of the network. Similarly, we will call the quantity
$\lceil \frac{1}{k} L(R, G) \rceil$ the k-congestion (or, when k is clear from the context, simply
congestion). Fix a graph G. Let $T = \{I_i\}_{i \in N}$ be a sequence of instances on G, and
let $S = \{R_i\}_{i \in N}$ be a sequence of routing schemes. We say that S produces a
k-gap $r \ge 1$ on G if and only if, for every $i \ge 1$:

$$\frac{W(R_i, G, k)}{\lceil \frac{1}{k} L(R_i, G) \rceil} \ge r.$$

We denote by $Gap(S, G, k)$ the maximum k-gap that S produces on G. We define
the k-gap of G, denoted by $Gap(G, k)$, as the supremum of $Gap(S, G, k)$ taken
over all possible sequences S. Again, we will omit k when its value is clear from
the context. We define $N(G)$ as the minimum k such that $Gap(G, k) = 1$, if such
a k exists.
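The load and the k-congestion lower bound defined above are easy to compute for a given routing scheme. The following Python sketch does so under the same illustrative conventions as before (paths as lists of integer vertices); the function names `load` and `k_congestion` are ours.

```python
from collections import Counter

def load(paths, directed=False) -> int:
    """L(R, G): the largest number of paths of R sharing a single edge."""
    usage = Counter()
    for path in paths:
        edges = set()
        for u, v in zip(path, path[1:]):
            edges.add((u, v) if directed else (min(u, v), max(u, v)))
        usage.update(edges)
    return max(usage.values(), default=0)

def k_congestion(paths, k, directed=False) -> int:
    """The lower bound ceil(L(R, G) / k) on W(R, G, k)."""
    return -(-load(paths, directed) // k)   # integer ceiling division

# Three requests on a star with centre 0 and leaves 1, 2, 3 (undirected model).
star_paths = [[1, 0, 2], [2, 0, 3], [1, 0, 3]]
single_fibre_congestion = k_congestion(star_paths, 1)
two_fibre_congestion = k_congestion(star_paths, 2)
```

On this instance the single fibre congestion is 2, although three wavelengths are needed with one fibre because the three paths conflict pairwise; with two fibres the lower bound of 1 is attained.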

Acyclic networks. Acyclic networks are modeled by acyclic graphs. In acyclic 
graphs an instance / uniquely determines its routing scheme, so we omit R: I 
will denote both a set of requests and the associated set of paths. 

3 More Fibres Can Help 

In this section we prove that there exist a network G, an instance I on G, and a
routing scheme R for I such that the ratio between the number of wavelengths
needed for R using k fibres and the number of wavelengths needed for R using
k + 1 fibres can be made arbitrarily large. We start with two observations.

Observation 1 Let I be an instance on some graph $G = (V, E)$, and let R be
a routing scheme for I. If the conflict graph $G_c$ associated to R and G contains
no triangles (cliques with 3 nodes) then $W(R, G, 2) = 1$.



Observation 2 Let Gc be an arbitrary graph. Then we can construct, in time 
polynomial in the size of Gc a graph G, an instance I on G, and a routing scheme 
R for I on G, such that Gc is the conflict graph of (R, G). 

We use these observations to prove the following result. 

Theorem 3. Given any m > 1 it is possible to construct a graph $G = (V, E)$,
an instance I on G, and a routing scheme R for I such that: $W(R, G, 1) \ge m$
and $W(R, G, 2) = 1$.

Sketch of proof. Let Gc be any graph with chromatic number at least m and 
with maximum clique at most 2 (for the construction of such a graph see for 
example HD) From Gc, we construct a graph G, an instance I on G, and a 
routing scheme R for I on G such that $G_c$ is the conflict graph of $(R, G)$. Then
we conclude that $W(R, G, 1) \ge m$ and $W(R, G, 2) = 1$. □

We generalize Thm. 3 as follows.

Theorem 4. Given any m > 1 and $k \ge 2$ it is possible to construct a graph
$G = (V, E)$, an instance I on G, and a routing scheme R for I such that
$W(R, G, k-1) \ge m$ and $W(R, G, k) = 1$.

Sketch of proof. It is possible to construct a network G, an instance I on G,
and a routing scheme R for I such that: for every subset $S \subseteq R$ of $k+1$ paths
there is an edge $e$ in G such that all the paths of S share the edge $e$, and
$L(R, G) = k+1$. As a consequence we have that $W(R, G, k) \ge |R|/k$, while
$W(R, G, k+1) = 1$. □
4 Star Networks

An n-star is a graph $G = (V, E)$ such that $V = \{c, x_1, \ldots, x_n\}$ and
$E = \{(c, x_i),\ i = 1, \ldots, n\}$. The node $c$ is the center of the star, while the nodes
$x_i$ are the leaves of the star. In the case of star networks using the single fibre
directed model it is possible to efficiently (in polynomial time) route all instances 
with a number of wavelengths equal to the network congestion (this problem is 
equivalent to computing the chromatic index of bipartite graphs US!). This is no 
longer true in the undirected model. The best ratio achievable in the undirected 
model between number of wavelengths and network congestion is 3/2 fS|- Also, 
computing the optimal wavelength assignment in this model is an NP-hard prob- 
lem (it is equivalent to the edge-coloring problem of multigraphs, which is an 
NP-hard problem |E|). In the next theorem we show a rather surprising result: 
using 2 fibres it is always (independently of the size of the star) possible to find 
a wavelength assignment whose cardinality is equal to the network congestion. 
Moreover, this can be done in polynomial time. 

Theorem 5. Let G be any n-star. Then, in the undirected model, N(G) = 2. 

Proof. Let $G$ be an $n$-star with vertex set $V = \{c, x_1, \ldots, x_n\}$. Let $I$ be any set
of paths on $G$. We have $L(I,G) = \max\{\deg(x_i) \mid i = 1, \ldots, n\}$, where $\deg(x_i)$ is
the number of paths touching the node $x_i$. Without loss of generality we assume
$L(I,G)$ even. We prove that there exists a legal 2-wavelength assignment for $G$
of cardinality $L(I,G)/2$. We first add to $I$ as many new paths as we can without
increasing $L(I,G)$. At the end of this procedure each node has degree $L(I,G)$
except for at most one node. For assume this is not the case. Then there are two
distinct nodes $x_i$ and $x_j$ of less than maximum degree. Adding the path $(x_i, x_j)$
to $I$ does not increase $L(I,G)$, contradicting the maximality of $I$. Assume that
$x_i$ is the only node with degree $d < L(I,G)$. Since $n-1$ nodes have degree
$L(I,G)$ and $x_i$ has degree $d$, and since each path has 2 endpoints, we know that
$(n-1)L(I,G) + d$ is even. Since $L(I,G)$ is even we conclude that $d$ is also even.
We now add two new nodes $x_{n+1}$ and $x_{n+2}$ to $G$. Then we add to $I$:

- $(L(I,G) - d)/2$ paths from $x_i$ to $x_{n+1}$,

- $(L(I,G) - d)/2$ paths from $x_i$ to $x_{n+2}$, and

- $(L(I,G) + d)/2$ paths from $x_{n+1}$ to $x_{n+2}$.

Now, all the nodes of $G$ have even degree $L(I,G)$. Consider a new graph
$G' = (V',E')$ where

$V' = V \setminus \{c\}$ and $E' = \{(x_i, x_j) \mid \text{the path from } x_i \text{ to } x_j \text{ belongs to } I\}$.

$G'$ is $L(I,G)$-regular (each node has degree $L(I,G)$) with $L(I,G)$ even, so $E'$ can
be partitioned [?] into $L(I,G)/2$ subsets $E'_1, \ldots, E'_{L(I,G)/2}$ such that each graph
$G'_i = (V', E'_i)$ is 2-regular. Let $I_i \subseteq I$ be the set of paths corresponding to $E'_i$. It is
easy to verify that $L(I_i, G) = 2$ and therefore $W(I_i, G, 2) = 1$. This implies that
there exists a legal 2-wavelength assignment $W$ for $(I,G)$ with $|W| = L(I,G)/2$
as claimed.

Since there is a polynomial time algorithm to partition a graph into 2-factors
(2-regular spanning subgraphs) [?], and the other procedures used in the
constructive proof above are clearly in polynomial time, we have:

Corollary 6. Let $G$ be any $n$-star and $I$ be any instance on $G$. An optimal
wavelength assignment for $(I,G)$ in the 2-fibres undirected model can be computed
in polynomial time.
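
The partition step of the proof can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: leaves are numbered 0..n-1, the padded instance is given as a list of leaf pairs forming a 2d-regular multigraph without self-loops, and the 2-factors are found by an Euler orientation followed by repeated bipartite matching. This is a standard route to 2-factorisation and not necessarily the algorithm of the reference cited above; the function names `euler_orientation` and `two_factors` are hypothetical.

```python
from collections import defaultdict


def euler_orientation(n, edges):
    """Orient every edge of an even-degree multigraph along closed trails,
    so that each vertex gets equal in-degree and out-degree."""
    adj = defaultdict(list)                 # vertex -> [(neighbour, edge id)]
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))
    used = [False] * len(edges)
    oriented = [None] * len(edges)
    for start in range(n):
        stack = [start]
        while stack:
            u = stack[-1]
            while adj[u] and used[adj[u][-1][1]]:
                adj[u].pop()                # discard edges already traversed
            if not adj[u]:
                stack.pop()
                continue
            v, eid = adj[u].pop()
            used[eid] = True
            oriented[eid] = (u, v)          # traverse the edge from u to v
            stack.append(v)
    return oriented


def two_factors(n, edges):
    """Split a 2d-regular multigraph on vertices 0..n-1 into d 2-factors."""
    oriented = euler_orientation(n, edges)
    remaining = defaultdict(list)           # tail -> [(head, edge id)]
    for eid, (u, v) in enumerate(oriented):
        remaining[u].append((v, eid))
    d = len(edges) // n
    factors = []
    for _ in range(d):
        match_head = {}                     # head -> (tail, edge id)

        def augment(u, seen):
            # Kuhn's augmenting-path search in the tail/head bipartite graph.
            for v, eid in remaining[u]:
                if v in seen:
                    continue
                seen.add(v)
                if v not in match_head or augment(match_head[v][0], seen):
                    match_head[v] = (u, eid)
                    return True
            return False

        for u in range(n):
            if not augment(u, set()):
                raise ValueError("input is not a regular graph of even degree")
        chosen = {eid for _, eid in match_head.values()}
        factors.append([edges[eid] for eid in sorted(chosen)])
        for u in range(n):
            remaining[u] = [(v, e) for v, e in remaining[u] if e not in chosen]
    return factors
```

Each returned factor touches every leaf exactly twice, so the corresponding paths load every star edge with exactly 2 paths and can share one wavelength when 2 fibres are available.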



5 Tree Networks 



We first consider the well studied H-graph network, the complete binary tree
of height 2, in the undirected model. We show that the H-graph is the simplest
network that needs more than 2 fibres per link to close the gap between number
of wavelengths and network congestion: 4 fibres are necessary and sufficient. We
also show that the H-graph is a simple example of a network that exhibits a
monotonicity anomaly: using more fibres may increase the gap.

Theorem 7. Let $G$ be an H-graph. In the undirected model we have: $N(G) = 4$,
and $Gap(G, 5) > 1$.

The proof, a small explicit instance, will be presented in the full journal version. 

If only leaf-to-leaf communication paths are allowed then it is possible to
prove that $N(G) = 2$ in the undirected model and $N(G) = 1$ in the directed
model. We now prove that, in a directed H-graph, 2 fibres per link are not enough
for closing the gap between number of wavelengths and network congestion. The
problem of determining the minimum number of fibres necessary for closing the
above mentioned gap in a directed H-graph is still open.

Theorem 8. Let $G$ be a directed H-graph. Then $N(G) \geq 3$.

Proof. We construct a sequence $S$ of instances such that $Gap(S,G,2) \geq 9/8$. Let $x$
be the root of $G$. Let $y_1$ and $y_2$ be the left and the right child of $x$. Let $z_1, z_2, z_3$,
and $z_4$ be the leaves of $G$ listed from left to right. Let $I$ be defined as follows: 3
copies of path $(z_1,z_2)$, 2 copies of path $(z_2,z_1)$, 3 copies of path $(z_3,z_4)$, 2 copies
of path $(z_4,z_3)$, 2 copies of path $(z_2,z_3)$, 1 copy of path $(z_3,z_2)$, 1 copy of path
$(z_1,y_2)$, 1 copy of path $(y_1,z_4)$, 1 copy of path $(y_2,z_1)$, 1 copy of path $(z_4,y_1)$,
and 1 copy of path $(z_4,z_1)$.

We have $|I| = 18$ and $L(I,G) = 4$. Let $I_j$ be a set of paths consisting of $j$
copies of $I$. Let $S = \{I_j\}_{j \in \mathbb{N}}$. It is easy to verify that $|I_j| = \frac{9}{2} L(I_j,G)$. Since the
largest subset $Q$ of $I_j$ such that $W(Q,G,2) = 1$ has cardinality 8 (this can be
proven by exhaustive analysis), we have

\[ W(I_j,G,2) \geq \frac{|I_j|}{8} = \frac{9}{16}\, L(I_j,G) \]

or, equivalently,

\[ \frac{W(I_j,G,2)}{\frac{1}{2} L(I_j,G)} \geq \frac{9}{8}. \]

By definition of Gap we conclude that $Gap(S,G,2) \geq 9/8$, so $Gap(G,2) \geq 9/8$.
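
The counting in this proof can be checked mechanically. The Python sketch below is illustrative only: it assumes the instance reads as 3 copies of (z1,z2), 2 of (z2,z1), 3 of (z3,z4), 2 of (z4,z3), 2 of (z2,z3), and one copy each of (z3,z2), (z1,y2), (y1,z4), (y2,z1), (z4,y1), (z4,z1) (a reading of the garbled source that makes every directed edge carry exactly 4 paths). The bound of 8 paths per wavelength is taken from the proof and is not verified by the code; all identifiers are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Directed H-graph: root x, children y1 and y2, leaves z1, z2 (under y1)
# and z3, z4 (under y2).
PARENT = {"y1": "x", "y2": "x", "z1": "y1", "z2": "y1", "z3": "y2", "z4": "y2"}


def ancestors(v):
    """Return v followed by its ancestors up to the root."""
    chain = [v]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def directed_edges(src, dst):
    """Directed edges used by the unique tree path from src to dst."""
    up, down = ancestors(src), ancestors(dst)
    meet = next(v for v in up if v in down)          # lowest common ancestor
    climb = up[:up.index(meet) + 1]
    descend = down[:down.index(meet) + 1][::-1]
    route = climb + descend[1:]
    return list(zip(route, route[1:]))


# Assumed reading of the instance I: (source, destination) -> number of copies.
INSTANCE = {
    ("z1", "z2"): 3, ("z2", "z1"): 2, ("z3", "z4"): 3, ("z4", "z3"): 2,
    ("z2", "z3"): 2, ("z3", "z2"): 1, ("z1", "y2"): 1, ("y1", "z4"): 1,
    ("y2", "z1"): 1, ("z4", "y1"): 1, ("z4", "z1"): 1,
}


def edge_loads(instance):
    """Number of paths crossing each directed edge."""
    loads = Counter()
    for (src, dst), copies in instance.items():
        for edge in directed_edges(src, dst):
            loads[edge] += copies
    return loads


size = sum(INSTANCE.values())                 # |I|
loads = edge_loads(INSTANCE)
congestion = max(loads.values())              # L(I, G)
# At most 8 paths per wavelength (stated in the proof), 2 fibres per link.
gap_bound = Fraction(size, 8) / Fraction(congestion, 2)
```

Under these assumptions `size` is 18, `congestion` is 4, and `gap_bound` evaluates to 9/8, matching the displayed inequalities.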



For general optical trees of height 2 we have the following result. 

Theorem 9. For every $k > 1$ there exists an undirected optical tree $T$ of height
2 such that $Gap(T,k) \geq 1 + \frac{1}{2k^2}$.

Proof. Let $k > 1$ be an integer. We define an undirected optical tree $T$ of height
2 as follows. $X$ is the root of $T$ with left child $L$ and right child $R$; $L$ has $2k$
children $l_0, \ldots, l_{2k-1}$ and $R$ has $2k$ children $r_0, \ldots, r_{2k-1}$. We define on $T$ a set $P$
of undirected paths as follows. $P$ consists of $2k-1$ copies of left-short-paths,
$2k-1$ copies of right-short-paths, and 1 copy of long-paths, where

left-short-paths $= \{(l_0,l_1), (l_2,l_3), \ldots, (l_{2k-2},l_{2k-1})\}$
right-short-paths $= \{(r_0,r_1), (r_2,r_3), \ldots, (r_{2k-2},r_{2k-1})\}$
long-paths $= \{(l_i,r_i) \mid i = 2h,\ h = 0,1,\ldots,k-1\} \cup \{(l_i, r_{(i+2) \bmod 2k}) \mid i = 2h+1,\ h = 0,1,\ldots,k-1\}$

The cardinality of $P$ is $4k^2$ and $L(P,T) = 2k$. Let $I_j$ be the set of paths on
$T$ obtained by taking $j$ copies of $P$. Trivially, the cardinality of $I_j$ is $4jk^2$
and $L(I_j,T) = 2jk$. Let $S = \{I_j\}_{j \in \mathbb{N}}$. Let $P'$ be any subset of $I_j$ such that
$W(P',T,k) = 1$. We claim that the cardinality of $P'$ is at most $2k^2$ and that if
$P'$ contains at least one long path, then the cardinality of $P'$ is at most $2k^2 - 1$.
To prove our claim we proceed as follows. Let $P'$ be the union of $k$ copies of
the paths in the set left-short-paths and $k$ copies of the paths in the set
right-short-paths. It is not difficult to prove that $W(P',T,k) = 1$ and that
the cardinality of $P'$ is $2k^2$. If we insert a long path in $P'$, in order to maintain
$W(P',T,k) = 1$, we are forced to remove 2 short paths from $P'$, decreasing its
cardinality by one. We can insert in $P'$ at most $k-1$ other long paths without
violating the property $W(P',T,k) = 1$. Each insertion of a long path in $P'$ forces
us to remove at least one short path. This completes the proof of our claim.

$I_j$ contains $2jk$ long paths. Since $W(P',T,k) = 1$, $P'$ contains at most $k$
long paths. This means that for assigning wavelengths to all long paths we need
at least $2j$ distinct new wavelengths. We call them long wavelengths. Each long
wavelength can be given to at most $2k^2 - 1$ paths in $I_j$. This means that to
complete the wavelength assignment of $I_j$ we still have to assign wavelengths to
at least $4jk^2 - 2j(2k^2 - 1) = 2j$ paths, and then we need at least $\frac{2j}{2k^2} = \frac{j}{k^2}$
new wavelengths. So $W(I_j,T,k) \geq 2j + \frac{j}{k^2}$. Since $\frac{1}{k} L(I_j,T) = 2j$, we conclude
that $Gap(S,T,k) \geq 1 + \frac{1}{2k^2}$ as claimed.
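
As a sanity check on the arithmetic of this proof, the following Python sketch builds the path set $P$ for a given $k$ and evaluates the counting bound with exact fractions. It is a sketch under assumptions: the second family of long paths is read as $(l_i, r_{(i+2) \bmod 2k})$ for odd $i$, and the per-wavelength capacities ($2k^2$, and $2k^2 - 1$ when a long path is present) are taken from the claim in the proof rather than verified. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def build_instance(k):
    """Return the multiset P as a Counter mapping (leaf, leaf) -> copies."""
    paths = Counter()
    for h in range(k):
        paths[(f"l{2*h}", f"l{2*h+1}")] += 2 * k - 1      # left short paths
        paths[(f"r{2*h}", f"r{2*h+1}")] += 2 * k - 1      # right short paths
        paths[(f"l{2*h}", f"r{2*h}")] += 1                # long path, even i
        paths[(f"l{2*h+1}", f"r{(2*h+3) % (2*k)}")] += 1  # long path, odd i
    return paths


def congestion(paths):
    """Maximum number of paths crossing any edge of the height-2 tree."""
    load = Counter()
    for (a, b), copies in paths.items():
        load[a] += copies          # edge between leaf a and its parent
        load[b] += copies          # edge between leaf b and its parent
        if a[0] != b[0]:           # a long path also crosses (L,X) and (X,R)
            load["L-X"] += copies
            load["X-R"] += copies
    return max(load.values())


def gap_lower_bound(k, j=1):
    """Counting bound W(I_j, T, k) / (L(I_j, T) / k) from the proof."""
    total = 4 * j * k * k                       # |I_j|
    long_wavelengths = 2 * j                    # 2jk long paths, k per wavelength
    leftover = total - long_wavelengths * (2 * k * k - 1)
    wavelengths = long_wavelengths + Fraction(leftover, 2 * k * k)
    return wavelengths / Fraction(2 * j * k, k)
```

For example, `gap_lower_bound(3)` evaluates to 19/18, which equals $1 + 1/(2 \cdot 3^2)$; the value does not depend on `j`.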

A similar result can be proven in the directed model. 

Theorem 10. For every $k > 1$ there exists a directed optical tree $T$ of height 2
such that $Gap(T,k) \geq 1 + \frac{1}{2k^2}$.

We omit the proof of this theorem since it is similar to the proof of Theorem 9.

6 Ring Networks 

An $n$-ring is a graph $G = (V,E)$ with

$V = \{x_0, \ldots, x_{n-1}\}$ and $E = \{(x_i,x_{i+1}),\ i = 0, \ldots, n-2\} \cup \{(x_{n-1},x_0)\}$.

For ring networks, if we are not interested in routing schemes, there is no
difference between the directed and the undirected model. In fact, once we are given
a routing scheme $R$ for an instance $I$ in a directed optical ring, the set of paths
of $R$ can be partitioned into two disjoint sets of paths: $C$, the paths routed in the
clockwise direction, and $CC$, the paths routed in the opposite direction. Since there are
no conflicts among paths belonging to $C$ and $CC$ (they use different directions
on every edge), the original problem is equivalent to the two undirected
wavelength assignment problems given by the set of requests in $C$ and $CC$. For this
reason, we will consider only the problem of assigning wavelengths to a set $I$ of
undirected paths on a ring.

In the proof of the next theorem we give an algorithm which takes as input an optical
ring $G$ of any size and an instance $I$ on $G$ and produces a 2-legal wavelength
assignment whose cardinality is at most 3/2 times the network congestion
per fibre caused by $I$ (plus 3).

Theorem 11. There exists a polynomial time algorithm which, given any set $I$
of paths on any $n$-ring $G$, produces a legal 2-wavelength assignment $W$ such that

\[ |W| \leq \frac{3}{2} \cdot \frac{L(I,G)}{2} + 3. \]

Sketch of proof. Let $G = (V,E)$ be an $n$-ring, with vertex set $V = \{x_0, \ldots, x_{n-1}\}$
and edge set $E = \{(x_i,x_{i+1}),\ i = 0, \ldots, n-2\} \cup \{(x_{n-1},x_0)\}$. Let $I$ be any set
of paths on $G$. Without loss of generality, we assume that:

- each path of $I$ starting at $x_i$ and ending at $x_j$ (denoted by $(x_i,x_j)$) passes
through nodes $x_{i+1}, x_{i+2}, \ldots, x_{j-1}$, where sums are taken modulo $n$,

- each edge $e \in E$ has full load $L = L(I,G)$, i.e., exactly $L(I,G)$ paths pass
through $e$, and

- there are no paths starting from or ending at $x_0$.

We say that a path $(x_i,x_j)$ is a regular path if $i < j$ and is a crossing path
if $i > j$. Let $cp_1, \ldots, cp_L$ be the crossing paths for $(I,G)$. Our algorithm
computes a 2-legal wavelength assignment $W$ as follows. We first assume that every
crossing path $cp_i = (s_i,d_i)$ can be given two different wavelengths at the same
time. The first wavelength is associated to the segment $s_i, \ldots, x_0$ and the second
wavelength is associated to the segment $x_0, \ldots, d_i$. Taking advantage of this
(illegal) assumption, it is possible to find a legal 1-wavelength assignment $W'$ such
that $|W'| = L$ (this problem is equivalent to the wavelength assignment problem
on the line). $W'$ applied to a regular path returns a single wavelength, while $W'$
applied to a crossing path returns a pair of wavelengths associated to the first
and to the second segment of the path. We say that a subset $S = \{cp_{i_1}, \ldots, cp_{i_h}\}$
of crossing paths is cyclic according to $W'$ iff:

- for every $j = 1, \ldots, h-1$, the wavelength associated to the second segment of
$cp_{i_j}$ is equal to the wavelength associated to the first segment of $cp_{i_{j+1}}$, and

- the wavelength associated to the second segment of $cp_{i_h}$ is equal to the
wavelength associated to the first segment of $cp_{i_1}$.

We now partition the set of crossing paths into cyclic subsets $S_1, \ldots, S_m$.
Note that this decomposition is unique up to a relabeling of the cyclic subsets.
Consider now a generic cyclic set $S_i$ having cardinality $4h$ for some $h \geq 1$. Let $W_i$
be the set consisting of the $4h$ distinct wavelengths associated to the crossing
paths belonging to $S_i$. Let $I_i \subseteq I$ be the set of paths having a wavelength
belonging to $W_i$. It is easy to verify that $L(I_i,G) = 4h$ and that $L(I \setminus I_i, G) =
L - 4h$. We claim that there exists a wavelength assignment $W$ for the paths in
$I_i$ with cardinality $3h$. To prove this fact we proceed as follows.

Step 1. To each crossing path $cp_{i_{2j}}$, $j = 0, \ldots, 2h-1$, of $S_i$ we assign a new
wavelength $w$. We assign $w$ also to each path in $I_i$ whose wavelength according
to $W'$ is one of the two wavelengths associated to the two segments of $cp_{i_{2j}}$.
Globally, in Step 1, we use $2h$ new wavelengths.

Step 2. To each pair of crossing paths $cp_{i_{4j+1}}$ and $cp_{i_{4j+3}}$, $j = 0, \ldots, h-1$, of $S_i$
we assign a new wavelength. Globally, in Step 2, we use $h$ new wavelengths.
The wavelength assignment $W$ defined in Step 1 and 2 has the following
properties: $|W| = 3h$, $W$ is a 2-legal assignment for $(I_i,G)$, $L(I \setminus I_i, G) = L - 4h$, and
$|W| = (3/2)(L(I_i,G)/2)$.

Assume for a moment that all the $S_i$ have cardinalities that are multiples of 4. In this
easy case we just have to repeat Step 1 and 2 until all cyclic sets have been
considered. Unfortunately, in general not all the $S_i$ have cardinality $4h$ for some
$h \geq 1$. If $|S_i| = 4h + d$, $d = 1,2,3$, we need to construct a
wavelength assignment $W$ in a slightly more complicated way. We now describe
the basic idea for dealing with the case $|S_i| = 4h + 1$. The other 2 cases can be
treated analogously and the complete proof will be given in the full paper.

If $|S_i| = 4h + 1$, we distinguish two cases. If there exists another $S_j$ such that
$|S_j| = 4h + 1$, we construct $W$ for $S_i \cup S_j$ using $6h$ wavelengths using a technique
which is similar to that used in Step 1 and 2. If $S_i$ is the unique cyclic set with
cardinality $4h + 1$ then we construct $W$ for $S_i$ alone using $3h + 1$ wavelengths. This
is one of the three cases in which we are forced to use one additional wavelength.
This completes the sketch of the proof. □
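
The sketch relies on the fact that, once the crossing paths are cut at $x_0$, the problem becomes wavelength assignment on a line, where as many wavelengths as the load suffice. The Python sketch below illustrates only that sub-step, using the standard greedy interval colouring (sort by left endpoint, reuse released colours). It assumes each path is a pair `(start, end)` with `start < end` that occupies edges `start, ..., end - 1`; the names `colour_line` and `line_load` are hypothetical and the code is not the authors' procedure for the ring itself.

```python
import heapq


def line_load(paths):
    """Maximum number of paths sharing an edge of the line."""
    events = []
    for start, end in paths:
        events.append((start, 1))
        events.append((end, -1))
    load = best = 0
    for _, delta in sorted(events):      # at equal positions, -1 sorts first
        load += delta
        best = max(best, load)
    return best


def colour_line(paths):
    """Greedy 1-fibre wavelength assignment on a line.

    Returns a dict mapping path index -> wavelength number."""
    order = sorted(range(len(paths)), key=lambda p: paths[p][0])
    free, busy = [], []                  # heaps of colours / (end, colour)
    colours, next_colour = {}, 0
    for p in order:
        start, end = paths[p]
        while busy and busy[0][0] <= start:
            _, released = heapq.heappop(busy)
            heapq.heappush(free, released)
        if free:
            colour = heapq.heappop(free)
        else:
            colour = next_colour
            next_colour += 1
        colours[p] = colour
        heapq.heappush(busy, (end, colour))
    return colours
```

A new colour is opened only when every existing colour is in use on the edge where the current path starts, so the number of colours never exceeds the load.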

The result given in Thm. 11 for the 2 fibres model can be generalized to the $k$
fibres model using a similar proof technique.

Theorem 12. There exists a polynomial time algorithm which, given any
instance $I$ on any $n$-ring $G$, produces a legal $k$-wavelength assignment $W$ such
that $|W| \leq \left(1 + \frac{1}{k}\right) \frac{L(I,G)}{k} + c_k$, where $c_k$ depends on $k$ but is independent of $I$
and $n$ (and then $L(I,G)$).

In the next theorem we prove that if we consider optical rings of any size with
$k$ fibres per link, the ratio between wavelengths and network congestion can be
made arbitrarily close to $\frac{2k}{2k-1}$. As a consequence, we have that, no matter how
many fibres we use, it is impossible to close the gap between wavelengths and
network congestion (as we did in the case of optical stars) for the entire family
of optical rings at the same time.

Theorem 13. Consider any algorithm which, for every $n$-ring $G$ and every
instance $I$ on $G$, finds a legal $k$-wavelength assignment $W$ for $(I,G)$ such that
$|W| \leq \alpha \cdot \frac{L(I,G)}{k}$. Then $\alpha \geq \frac{2k}{2k-1}$.

Sketch of proof. Let $G_n = (V,E)$ be an $n$-ring with $n$ even, where
$V = \{x_0, \ldots, x_{n-1}\}$ and $E = \{(x_i,x_{i+1}),\ i = 0, \ldots, n-2\} \cup \{(x_{n-1},x_0)\}$.

Let $I_n = P_1 \cup P_2$ where

$P_1 = \{(x_i, x_{i+n/2}) \mid 0 \leq i \leq n/2 - 1\}$, $P_2 = \{(x_{i+n/2}, x_{i+1}) \mid 0 \leq i \leq n/2 - 2\}$.

It is easy to see that $L(I_n,G_n) = n/2$ and that $|I_n| = n - 1$, while it takes a little
effort to prove that the largest subset $Q$ of $I_n$ such that $W(Q,G_n,k) = 1$ has
cardinality $2k - 1$. As a consequence of this consideration we have
$W(I_n,G_n,k) \geq \frac{n-1}{2k-1}$ and then

\[ \frac{W(I_n,G_n,k)}{L(I_n,G_n)/k} \geq \frac{2k}{2k-1} - \frac{2k}{n(2k-1)}. \qquad (1) \]

If $\alpha < \frac{2k}{2k-1}$, using Equation (1) with $n$ large enough we get a contradiction. □
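
The two easy facts used in this sketch, $L(I_n,G_n) = n/2$ and $|I_n| = n-1$, and the resulting ratio can be reproduced with a few lines of Python. This is an illustrative sketch: it assumes edge $e$ joins $x_e$ and $x_{(e+1) \bmod n}$ and that every path is routed clockwise from its first to its second endpoint, and it takes the bound $2k-1$ on the size of a single-wavelength subset from the proof without checking it. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def ring_edges(i, j, n):
    """Edges used by the clockwise path from x_i to x_j on an n-ring."""
    length = (j - i) % n
    return [(i + step) % n for step in range(length)]


def instance(n):
    """The instance I_n = P_1 union P_2 for even n."""
    half = n // 2
    p1 = [(i, i + half) for i in range(half)]
    p2 = [(i + half, i + 1) for i in range(half - 1)]
    return p1 + p2


def congestion(paths, n):
    """Maximum number of paths crossing any ring edge."""
    load = Counter()
    for i, j in paths:
        load.update(ring_edges(i, j, n))
    return max(load.values())


def ratio_bound(n, k):
    """Lower bound on W / (L / k) when a wavelength serves at most 2k-1 paths."""
    paths = instance(n)
    wavelengths = Fraction(len(paths), 2 * k - 1)
    return wavelengths / Fraction(congestion(paths, n), k)
```

For $n = 8$ and $k = 2$ this gives 7 paths, congestion 4, and a ratio of 7/6, which agrees with $\frac{2k}{2k-1} - \frac{2k}{n(2k-1)} = \frac{4}{3} - \frac{1}{6}$.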

We end this section by considering a slightly different version of the standard 
wavelength assignment problem for optical rings in which the maximum length 
of input requests is at most c < n. In this framework it is possible to prove the 
following result. 

Theorem 14. If we consider requests of length at most $c$, then there exists $k_c$
(which depends only on $c$) such that for every $n$-ring $G$ we have $Gap(G,k_c) = 1$.

Sketch of proof. Let $I$ be any instance on any $n$-ring $G$. As usual we assume
that for every edge $e$ of $G$ we have $L(I,G,e) = L(I,G)$. First we note that the
number of distinct crossing paths (defined in the proof of Thm. 11) of $I$ is at
most $c$. As a consequence of this fact, we claim that it is possible to find a set
of requests $I' \subseteq I$ such that

\[ L(I',G) \leq c \quad \text{and} \quad L(I \setminus I', G) = L(I,G) - L(I',G). \qquad (2) \]

The set $I'$ can be constructed as follows. We start from $I' = \{cp\}$ where $cp$ is
any crossing path. Then we move clockwise on the ring adding to $I'$ a sequence
of adjacent paths until we reach a new crossing path $cp'$ whose first endpoint is
equal to the first endpoint of another crossing path $cp'' \in I'$. Note that this must
happen within at most $c$ complete rounds since the number of distinct crossing
paths is at most $c$. At this point we do not add $cp'$ to $I'$ but we remove from $I'$
all the paths (possibly none) inserted before $cp''$. It takes a little effort to prove
that $I'$ satisfies properties (2). We repeat this procedure until all paths of $I$ have
been considered, getting a partition of $I$ into $m$ subsets $I'_i$, $1 \leq i \leq m$, such that

\[ L(I'_i,G) \leq c \quad \text{and} \quad \sum_{i=1}^{m} L(I'_i,G) = L(I,G). \qquad (3) \]

It remains to be proven that there exists $k_c$ such that using $k_c$ fibres per link
$Gap(S,G,k_c) = 1$ for every sequence of instances $S$. It can be proven (due to
limited space we omit this part of the proof) that choosing $k_c = c!$ no sequence
of instances can produce a gap bigger than 1. □

Abstract. Markov-reward models, as extensions of continuous-time
Markov chains, have received increased attention for the specification
and evaluation of performance and dependability properties of systems.
Until now, however, the specification of reward-based performance and
dependability measures has been done manually and informally. In this
paper, we change this undesirable situation by the introduction of a
continuous-time, reward-based stochastic logic. We argue that this logic
is adequate for expressing performability measures of a large variety. We
isolate two important sub-logics, the logic CSL [?], and the novel logic
CRL that allows one to express reward-based properties. These logics
turn out to be complementary, which is formally established in our
main duality theorem. This result implies that reward-based properties
expressed in CRL for a particular Markov reward model can be
interpreted as CSL properties over a derived continuous-time Markov chain,
so that model checking procedures for CSL can be employed.



1 Introduction 

With the advent of fault-tolerant and distributed computer and communication
systems, the classical separation between performance evaluation and
dependability (i.e., reliability, availability and timeliness) evaluation does not make
sense anymore. Instead, the combined performance and dependability of a
system is of critical importance. This observation led to the development of the
performability evaluation framework [?]. This framework allows one to specify
models that include both performance-related and dependability-related events
in a natural way. Furthermore, the choice of Markov-reward models (MRMs)
[?] as mathematical basis allows one to specify a wide variety of measures of
interest, albeit at times in a slightly cumbersome way. An MRM is a
continuous-time Markov chain (CTMC) augmented with a reward structure assigning a
real-valued reward to each state in the model. Such a reward can be interpreted as
bonus, gain, or dually, as cost. Typical measures of interest express the amount
of gain accumulated by the system, over a finite or infinite time-horizon.

Given the fact that the model is stochastic, the measures of interest are
stochastic variables. MRMs have been shown to pair a reasonable modelling
flexibility and expressiveness with manageable computational expenses for the model
evaluation. To increase the modelling flexibility, a number of application-oriented
model specification techniques and supporting tools have been developed [?].

The specification of the measure-of-interest for a given MRM cannot always
be done conveniently, nor can all possible measures-of-interest be expressed con- 
veniently. In particular, until recently it has not been possible to directly express 
measures where state sequences or paths matter, nor to accumulate rewards only 
in certain subsets of states, if the rewards outside these subsets are non-zero. 
Such measures are then either “specified” informally, with all its negative im- 
plications, or require a manual tailoring of the model so as to address the right 
subsets of states. An example of a measure that is very difficult to specify di- 
rectly is the expected amount of gain obtained from the system until a particular 
state is reached, provided that all paths to that state obey certain properties. 

Recently, Obal and Sanders have proposed a technique to specify so-called
path-based reward variables [?] by which the specification of measures over state
sequences becomes more convenient, because it avoids the manual tailoring of
the model. In the context of the stochastic process algebra PEPA, Clark et al.
recently proposed the use of a probabilistic modal logic to ease the specification
of reward structures of MRMs [?], as opposed to the specification of reward-based
measures, as we do.

In [?] we proposed to specify measures of interest for CTMCs in the logic
CSL (Continuous Stochastic Logic), a superset of the (equally named) logic
introduced by Aziz et al. [?]. CSL includes a timed CTL-like time-bounded
until operator, and a steady-state operator. Using this logic, very complex
measures can be expressed easily; model-checking algorithms for CSL have been
proposed [?] (and implemented [?]). Notice, however, that CSL is interpreted
over CTMCs only, and is hence not able to address reward-based measures. The
current paper extends this work, in that Markov-reward models are evaluated,
i.e., CTMCs augmented with a reward structure.

In this paper, we introduce a novel continuous-time, stochastic reward-based 
logic CSRL, that is adequate for expressing performability measures of a large 
variety. It includes next and until operators, that are equipped with time- 
interval- as well as reward-interval-bounds. We present syntax and formal se- 
mantics of the logic, and isolate two important sub-logics: the logic CSL, and 
the logic CRL (Continuous Reward Logic) that allows one to express time- 
independent reward-based properties. These logics turn out to be complemen- 
tary, which is formally established in a main duality theorem, showing that time- 
and reward-intervals are interchangeable. More precisely, we show that for each
MRM $\mathcal{M}$ and formula $\Phi$ the set of states satisfying $\Phi$ equals the set of states of
a derived MRM $\mathcal{M}^{-1}$ satisfying formula $\Phi^{-1}$, where the latter is obtained from
$\Phi$ by simply swapping time- and reward-intervals. The transformation of $\mathcal{M}$ is
inspired by [?]. The fixpoint characterisations for the CRL path operators
(interpreted over an MRM) reduce to the characterisations that are used for model
checking CSL (over a CTMC). As a consequence of the duality result, the model
checking problem for CRL is reducible to the model checking problem for CSL
and hence solvable with existing techniques for CSL.

The paper is organised as follows. Section 2 introduces MRMs and typical
measures of interest for them. In Section 3 the logic CSRL and its sub-logics
are defined, whereas Section 4 presents the main duality theorem. Section 5
discusses its consequences for model checking and highlights that most
reward-based performability measures having appeared in the literature can be expressed
as simple formulas of (a minor extension of) the logic. Section 6 concludes the
paper.

2 Markov Reward Models 

In this section we introduce the basic concepts of MRMs [?]. We slightly depart
from the standard notation for MRMs (and CTMCs) and consider an MRM 
as an ordinary transition system, i.e., a Kripke structure, where the edges are 
equipped with probabilistic timing information and the states are augmented 
with a real number that indicates the earned reward per unit of time for staying 
in a state. This then allows the usual interpretation of linear-time temporal 
operators like next step and unbounded or time-bounded until. 

MRMs. Let AP be a fixed, finite set of atomic propositions. 

Definition 1. A (labelled) CTMC $\mathcal{C}$ is a tuple $(S, \mathbf{R}, L)$ where $S$ is a finite set of
states, $\mathbf{R} : S \times S \to \mathbb{R}_{\geq 0}$ the rate matrix, and $L : S \to 2^{AP}$ the labelling function
which assigns to each state $s \in S$ the set $L(s)$ of atomic propositions $a \in AP$
that are valid in $s$. A state $s$ is called terminal (or absorbing) iff $\mathbf{R}(s,s') = 0$
for all states $s'$.

Intuitively, $\mathbf{R}(s,s') > 0$ iff there is a transition from $s$ to $s'$; $1 - e^{-\mathbf{R}(s,s') \cdot t}$ is
the probability that the transition $s \to s'$ can be triggered within $t$ time units.
Thus the delay of transition $s \to s'$ is governed by an exponential distribution
with rate $\mathbf{R}(s,s')$. If $\mathbf{R}(s,s') > 0$ for more than one state $s'$, a competition
between the transitions exists, known as the race condition. The probability to
move from non-absorbing $s$ to $s'$ within $t$ time units, i.e., $s \to s'$ to win the race,
is given by

\[ \frac{\mathbf{R}(s,s')}{\mathbf{E}(s)} \cdot \left(1 - e^{-\mathbf{E}(s) \cdot t}\right) \]

where $\mathbf{E}(s) = \sum_{s' \in S} \mathbf{R}(s,s')$ denotes the total rate at which any transition
emanating from state $s$ is taken. More precisely, $\mathbf{E}(s)$ specifies that the
probability of leaving $s$ within $t$ time-units is $1 - e^{-\mathbf{E}(s) \cdot t}$, because the minimum of
exponential distributions, competing in a race, is characterised by the sum of
their rates. Consequently, the probability of moving from a non-absorbing state
$s$ to $s'$ by a single transition, denoted $\mathbf{P}(s,s')$, is determined by the probability
that the delay of moving from $s$ to $s'$ finishes before the delays of other outgoing
edges from $s$; formally, $\mathbf{P}(s,s') = \mathbf{R}(s,s')/\mathbf{E}(s)$. For absorbing states, the total
rate $\mathbf{E}(s) = 0$; we then have $\mathbf{P}(s,s') = 0$ for any state $s'$.
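
The quantities just introduced are straightforward to compute from a rate matrix. The following Python sketch is illustrative only and assumes the rate matrix is given as a list of rows of non-negative floats indexed by state number; the function names are hypothetical.

```python
import math


def exit_rates(R):
    """E(s): total rate of leaving each state s."""
    return [sum(row) for row in R]


def embedded_dtmc(R):
    """P(s, s') = R(s, s') / E(s), with P(s, s') = 0 for absorbing s."""
    E = exit_rates(R)
    return [[r / E[s] if E[s] > 0 else 0.0 for r in row]
            for s, row in enumerate(R)]


def prob_move_within(R, s, s_next, t):
    """Probability that the transition s -> s_next wins the race within time t."""
    E = exit_rates(R)
    if E[s] == 0:                      # absorbing state: no transition is taken
        return 0.0
    return R[s][s_next] / E[s] * (1.0 - math.exp(-E[s] * t))
```

For instance, with rates 2 and 1 out of a state, the faster transition wins the race with probability 2/3, and the state is left within $t$ time units with probability $1 - e^{-3t}$.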



On the Logical Characterisation of Performability Properties 783 



Definition 2. A (labelled) MRM $\mathcal{M}$ is a pair $(\mathcal{C}, \rho)$ where $\mathcal{C}$ is a (labelled)
CTMC, and $\rho : S \to \mathbb{R}_{\geq 0}$ is a reward structure that assigns to each state $s \in S$
a reward $\rho(s)$, also called gain or bonus or dually, cost.



Example 1. As a running example we consider a fault-tolerant multiprocessor
system inspired by [?]. The system consists of three processors,
three memories, and a single interconnection network that allows a processor
to access any memory. We model this system by a CTMC, depicted below,
where state $(i,j,1)$ models that $i$ processors and $j$ memories
($1 \leq i,j < 4$) are operational and are connected by a single network.
Initially all components are functioning correctly, i.e., the initial state is $(3,3,1)$.
The minimal operational configuration of the system is $(1,1,1)$.
The failure rate of a processor is $\lambda$, of a memory $\mu$, and of the network
$\gamma$ failures per hour (fph). It is assumed that a single repair unit is present
to repair all types of components. The expected repair time of a processor
is $1/\nu$ and of a memory $1/\eta$ hours. In case all memories, all
processors, or the network has failed the system moves to state $F$.
After a repair in state $F$, we assume the system to restart in
state $(3,3,1)$ with rate $\delta$.

The reward structure can be instantiated in different ways so as to specify
a variety of performability measures. The following reward structures are
taken from [?]. The simplest reward structure (leading to an availability
model) divides the states into operational and non-operational states: $\rho_1(F) =
0$ and $\rho_1(i,j,k) = 1$. A reward structure in which varying levels of performance
of the system are represented is for instance based on the capacity of the system:
$\rho_2(F) = 0$ and $\rho_2(i,j,k) = \min(i,j)$. The third reward structure does consider
processors contending for the memories, by taking as reward for operational
states the expected available memory bandwidth: $\rho_3(F) = 0$ and $\rho_3(i,j,k) =
m \cdot \left(1 - (1 - 1/m)^l\right)$ where $l = \min(i,j)$ and $m = \max(i,j)$. ■
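
The three reward structures of the example can be written down directly. The Python sketch below is illustrative: it assumes a state is either the string `"F"` or a triple `(i, j, k)`, and it reads the third structure as $m \cdot (1 - (1 - 1/m)^l)$ with $l = \min(i,j)$ and $m = \max(i,j)$; the function names are hypothetical, and exact fractions are used only to keep the values readable.

```python
from fractions import Fraction


def rho1(state):
    """Availability reward: 1 for operational states, 0 for the failure state."""
    return 0 if state == "F" else 1


def rho2(state):
    """Capacity reward: the number of usable processor-memory pairs."""
    if state == "F":
        return 0
    i, j, _ = state
    return min(i, j)


def rho3(state):
    """Expected available memory bandwidth under contention."""
    if state == "F":
        return Fraction(0)
    i, j, _ = state
    l, m = min(i, j), max(i, j)
    return m * (1 - (1 - Fraction(1, m)) ** l)
```

In the initial state $(3,3,1)$ the three structures yield 1, 3, and $3 \cdot (1 - (2/3)^3) = 19/9$ respectively.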

Let $\mathcal{M} = (\mathcal{C}, \rho)$ be an MRM with underlying CTMC $\mathcal{C} = (S, \mathbf{R}, L)$.

Paths. An infinite path $\sigma$ is a sequence $s_0, t_0, s_1, t_1, s_2, t_2, \ldots$ with, for $i \in \mathbb{N}$,
$s_i \in S$ and $t_i \in \mathbb{R}_{>0}$ such that $\mathbf{R}(s_i,s_{i+1}) > 0$. For $i \in \mathbb{N}$ let $\sigma[i] = s_i$, the
$(i+1)$-st state of $\sigma$, and $\delta(\sigma,i) = t_i$, the time spent in $s_i$. For $t \in \mathbb{R}_{\geq 0}$ and $i$
the smallest index with $t \leq \sum_{j=0}^{i} t_j$ let $\sigma@t = \sigma[i]$, the state in $\sigma$ at time $t$. For
$t = \sum_{j=0}^{k-1} t_j + t'$ with $t' \leq t_k$ we define $y(\sigma,t) = \sum_{j=0}^{k-1} t_j \cdot \rho(s_j) + t' \cdot \rho(s_k)$, the
cumulative reward along $\sigma$ up to time $t$.

A finite path $\sigma$ is a sequence $s_0, t_0, s_1, t_1, \ldots, t_{l-1}, s_l$ where $s_l$ is
absorbing, and $\mathbf{R}(s_i,s_{i+1}) > 0$ for all $i < l$. For finite $\sigma$, $\sigma[i]$ and $\delta(\sigma,i)$ are
only defined for $i \leq l$; they are defined as above for $i < l$, and $\delta(\sigma,l) = \infty$.
For $t > \sum_{j=0}^{l-1} t_j$ we let $\sigma@t = s_l$ and let the cumulative reward $y(\sigma,t) =
\sum_{j=0}^{l-1} t_j \cdot \rho(s_j) + \left(t - \sum_{j=0}^{l-1} t_j\right) \cdot \rho(s_l)$; for the other cases, $\sigma@t$ and $y(\sigma,t)$ are
defined as above.

Let $Path^{\mathcal{M}}(s)$ denote the set of (finite and infinite) paths starting in $s$.
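
The two path functions defined above, the state occupied at time $t$ and the cumulative reward up to time $t$, can be sketched as follows. The representation is an assumption made for illustration: a path (or a sufficiently long prefix of an infinite path) is a list of `(state, sojourn_time)` pairs, with `math.inf` as the sojourn time of a final absorbing state, and the reward structure is a dictionary. The function names are hypothetical.

```python
import math


def state_at(path, t):
    """sigma@t: the state occupied at time t."""
    elapsed = 0.0
    for state, sojourn in path:
        elapsed += sojourn
        if t <= elapsed:
            return state
    raise ValueError("path prefix is too short to cover time t")


def cumulative_reward(path, rho, t):
    """y(sigma, t): reward accumulated along the path up to time t."""
    reward, elapsed = 0.0, 0.0
    for state, sojourn in path:
        if t <= elapsed + sojourn:
            # Only the fraction of the sojourn before t earns reward.
            return reward + (t - elapsed) * rho[state]
        reward += sojourn * rho[state]
        elapsed += sojourn
    raise ValueError("path prefix is too short to cover time t")
```

For a path that stays 1 time unit in `a`, 2 in `b`, and then forever in the absorbing state `c`, with rewards 1, 0, and 2, the cumulative reward at time 4 is $1 \cdot 1 + 2 \cdot 0 + 1 \cdot 2 = 3$.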

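Borel space. Any state $s = s_0$ yields a probability measure $\Pr$ on $Path^{\mathcal{M}}(s)$
as follows. Let $s_0, \ldots, s_k \in S$ with $\mathbf{R}(s_i,s_{i+1}) > 0$ $(0 \leq i < k)$, and $I_0, \ldots, I_{k-1}$
non-empty intervals in $\mathbb{R}_{\geq 0}$. Then, $C(s_0, I_0, \ldots, I_{k-1}, s_k)$ denotes the cylinder
set consisting of all paths $\sigma \in Path^{\mathcal{M}}(s)$ such that $\sigma[i] = s_i$ $(i \leq k)$, and
$\delta(\sigma,i) \in I_i$ $(i < k)$. Let $\mathcal{F}(Path^{\mathcal{M}}(s))$ be the smallest $\sigma$-algebra on $Path^{\mathcal{M}}(s)$
which contains all sets $C(s, I_0, \ldots, I_{k-1}, s_k)$ where $s_0, \ldots, s_k$ ranges over all
state-sequences with $s = s_0$, $\mathbf{R}(s_i,s_{i+1}) > 0$ $(0 \leq i < k)$, and $I_0, \ldots, I_{k-1}$ ranges over
all sequences of non-empty intervals in $\mathbb{R}_{\geq 0}$. The probability measure $\Pr$ on
$\mathcal{F}(Path^{\mathcal{M}}(s))$ is the unique measure defined by induction on $k$: $\Pr(C(s_0)) = 1$,
and for $k \geq 0$,

\[ \Pr(C(s_0, \ldots, s_k, I', s')) = \Pr(C(s_0, \ldots, s_k)) \cdot \mathbf{P}(s_k,s') \cdot \left(e^{-\mathbf{E}(s_k) \cdot a} - e^{-\mathbf{E}(s_k) \cdot b}\right), \]

where $a = \inf I'$ and $b = \sup I'$. (For $b = \infty$ and $\lambda > 0$ let $e^{-\lambda \cdot \infty} = 0$.) Note
that $e^{-\mathbf{E}(s_k) \cdot a} - e^{-\mathbf{E}(s_k) \cdot b}$ is the probability of leaving state $s_k$ in the interval $I'$.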
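
The inductive definition of the measure on cylinder sets translates into a simple product. The following Python sketch is illustrative and assumes a rate matrix given as a list of rows, states as integer indices, and each interval as a pair `(a, b)` of its infimum and supremum (with `math.inf` allowed for `b`); the name `cylinder_probability` is hypothetical.

```python
import math


def cylinder_probability(R, states, intervals):
    """Pr(C(s_0, I_0, ..., I_{k-1}, s_k)) for a rate matrix R.

    `states` has one more entry than `intervals`."""
    prob = 1.0                                   # Pr(C(s_0)) = 1
    for k, (a, b) in enumerate(intervals):
        s, s_next = states[k], states[k + 1]
        exit_rate = sum(R[s])                    # E(s_k)
        if exit_rate == 0:
            return 0.0                           # absorbing: no further step
        upper = 0.0 if math.isinf(b) else math.exp(-exit_rate * b)
        leave = math.exp(-exit_rate * a) - upper # leave s_k during (a, b)
        prob *= R[s][s_next] / exit_rate * leave
    return prob
```

Each factor multiplies the branching probability $\mathbf{P}(s_k,s')$ by the probability of leaving $s_k$ within the given interval, exactly as in the displayed formula.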

Remark. For infinite paths we do not assume time divergence. Although such
paths represent “unrealistic” computations where infinitely many transitions are
taken in a finite amount of time, the probability measure of such Zeno paths is
0. This justifies a lazy treatment of the notations $\sigma@t$ and $y(\sigma,t)$ when we refer
to the probability of a measurable set of paths. ■

Steady-state and transient probabilities. For a CTMC $\mathcal{C}$ two major types of
state probabilities are distinguished: steady-state probabilities where the system
is considered “in the long run”, i.e., when an equilibrium has been reached, and
transient probabilities where the system is considered at a given time instant $t$.
Formally, the transient probability

\[ \pi^{\mathcal{C}}(s,s',t) = \Pr\{\sigma \in Path^{\mathcal{C}}(s) \mid \sigma@t = s'\} \]

stands for the probability to be in state $s'$ at time $t$ given the initial state $s$.
Note that this set is measurable. Steady-state probabilities are defined as

\[ \pi^{\mathcal{C}}(s,s') = \lim_{t \to \infty} \pi^{\mathcal{C}}(s,s',t). \]

This limit always exists for finite CTMCs. For $S' \subseteq S$, $\pi^{\mathcal{C}}(s,S') = \sum_{s' \in S'}
\pi^{\mathcal{C}}(s,s')$ denotes the steady-state probability for set $S'$. In the sequel, we will
often use $\mathcal{M}$ rather than $\mathcal{C}$ (the underlying CTMC of $\mathcal{M}$) as superscript.

3 Stochastic CTL with Time and Rewards



This section introduces a stochastic logic to reason about reward-based as well 
as time-based constraints, and identifies two important sub-logics of it. For ex- 
planatory purposes, we first introduce a simple branching time logic without any 
support for real time or reward constraints. 

Basic logic. The base stochastic logic SL, a stochastic variant of CTL
(Computational Tree Logic), is a continuous-time variant of PCTL [?].

Syntax. For $a \in AP$, $p \in [0,1]$ and $\bowtie\ \in \{\leq, <, \geq, >\}$, the state-formulas of SL
are defined by the grammar

\[ \Phi ::= \mathrm{tt} \mid a \mid \Phi \wedge \Phi \mid \neg\Phi \mid \mathcal{S}_{\bowtie p}(\Phi) \mid \mathcal{P}_{\bowtie p}(\varphi) \]

where path-formulas are defined by $\varphi ::= X\Phi \mid \Phi\, U\, \Phi$.

Other boolean connectives such as $\vee$ and $\to$ are derived in the obvious way.
As usual $\Diamond\Phi = \mathrm{tt}\, U\, \Phi$ and the $\Box$-operator can be obtained by, for example,
$\mathcal{P}_{\geq p}(\Box\Phi) = \neg\mathcal{P}_{>1-p}(\Diamond\neg\Phi)$.

The state-formula $\mathcal{S}_{\bowtie p}(\Phi)$ asserts that the steady-state
probability for the set of $\Phi$-states meets the bound $\bowtie p$. For the running
example, the formula $\mathcal{S}_{\geq 0.8}(2pup)$ expresses that the steady-state probability to
be in a state with two operational processors is at least 0.8, where $2pup$ holds
in state $(2,j,1)$, $1 \leq j < 4$. The operator $\mathcal{P}_{\bowtie p}(\cdot)$ replaces the usual CTL path
quantifiers $\exists$ and $\forall$. $\mathcal{P}_{\bowtie p}(\varphi)$ asserts that the probability measure of the paths
satisfying $\varphi$ meets the bound $\bowtie p$. For example, $\mathcal{P}_{\geq 0.3}(\Diamond F)$ denotes that the
probability to eventually reach the failure state of the multi-processor system is
at least 0.3.



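Semantics. The SL state-formulas are interpreted over the states of a CTMC
$\mathcal{C} = (S, R, L)$ (or an MRM $\mathcal{M}$ with underlying CTMC
$\mathcal{C}$) with proposition labels in AP. Let
$\mathit{Sat}^{\mathcal{C}}(\Phi) = \{ s \in S \mid s \models \Phi \}$.

$s \models tt$ for all $s \in S$
$s \models a$ iff $a \in L(s)$
$s \models \neg\Phi$ iff $s \not\models \Phi$
$s \models \Phi_1 \wedge \Phi_2$ iff $s \models \Phi_i$ for $i = 1, 2$
$s \models S_{\bowtie p}(\Phi)$ iff $\pi^{\mathcal{C}}(s, \mathit{Sat}^{\mathcal{C}}(\Phi)) \bowtie p$
$s \models P_{\bowtie p}(\varphi)$ iff $\mathit{Prob}^{\mathcal{C}}(s, \varphi) \bowtie p$

Here, $\mathit{Prob}^{\mathcal{C}}(s, \varphi)$ denotes the probability measure
of all paths satisfying $\varphi$ given that the system starts in state $s$,
i.e.,

$$\mathit{Prob}^{\mathcal{C}}(s, \varphi) = \Pr\{ \sigma \in \mathit{Path}^{\mathcal{C}}(s) \mid \sigma \models \varphi \}.$$

The fact that the set $\{ \sigma \in \mathit{Path}^{\mathcal{C}}(s) \mid \sigma \models \varphi \}$
is measurable can be easily verified. The intended meaning of the temporal
operators U and X is standard:

$\sigma \models X\Phi$ iff $\sigma[1]$ is defined and $\sigma[1] \models \Phi$
$\sigma \models \Phi_1 \, U \, \Phi_2$ iff $\exists k \geq 0.\ (\sigma[k] \models \Phi_2 \wedge \forall\, 0 \leq i < k.\ \sigma[i] \models \Phi_1)$.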

Alternative characterisations. For next-formulas we have, as for DTMCs [?]:

$$\mathit{Prob}^{\mathcal{C}}(s, X\Phi) = P(s, \Phi) \qquad (1)$$

where $P(s,\Phi) = \sum_{s' \models \Phi} P(s,s')$ is the probability to reach a
$\Phi$-state in one step from $s$. For until-formulas we have that the
probability $\mathit{Prob}^{\mathcal{C}}(s, \Phi_1 \, U \, \Phi_2)$ is the least
solution¹ of the following set of equations:
$\mathit{Prob}^{\mathcal{C}}(s, \Phi_1 \, U \, \Phi_2)$ equals 1 if
$s \models \Phi_2$, equals

$$\sum_{s' \in S} P(s,s') \cdot \mathit{Prob}^{\mathcal{C}}(s', \Phi_1 \, U \, \Phi_2) \qquad (2)$$

if $s \models \Phi_1 \wedge \neg\Phi_2$, and 0 otherwise. This probability can
be computed as the solution of a regular system of linear equations by
standard means such as Gaussian elimination [?] or can be approximated by an
iterative approach.
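
The iterative approach can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `until_probabilities`, the dictionary encoding of the one-step probabilities P(s, s'), and the stopping tolerance are all assumptions made for the example. Starting from the zero function and applying the equations repeatedly approaches the least solution from below.

```python
def until_probabilities(P, sat1, sat2, tol=1e-12, max_iter=100_000):
    """Approximate Prob(s, Phi1 U Phi2) for every state s of a finite chain.

    P    : dict state -> dict successor -> one-step probability P(s, s')
    sat1 : set of states satisfying Phi1 (hypothetical encoding)
    sat2 : set of states satisfying Phi2 (hypothetical encoding)
    """
    states = list(P)
    # Iterating from the zero function converges to the least solution.
    prob = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new = {}
        for s in states:
            if s in sat2:
                new[s] = 1.0                      # s satisfies Phi2
            elif s in sat1:
                # s satisfies Phi1 but not Phi2: equation (2)
                new[s] = sum(p * prob[t] for t, p in P[s].items())
            else:
                new[s] = 0.0                      # neither Phi1 nor Phi2
        if max(abs(new[s] - prob[s]) for s in states) < tol:
            return new
        prob = new
    return prob
```

For a chain in which state `a` moves to `b`, `goal` and `fail` with probabilities 0.5, 0.25 and 0.25, and `b` returns to `a` with probability 1, the equations give Prob(a) = 0.5 · Prob(a) + 0.25, so the iteration should settle near 0.5 for both `a` and `b`.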

The full logic. We now extend SL by providing means to reason about both 
time constraints and cumulative reward constraints. We refer to this logic as 
CSRL. Later we will identify fragments of CSRL that refer to only time, re- 
spectively only reward constraints. 

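Syntax. The syntax (and semantics) of the state formulas of CSRL are defined
as for the basic logic. Path-formulas $\varphi$ are defined for intervals
$I, J \subseteq \mathbb{R}_{\geq 0}$ by:

$$\varphi ::= X^I_J \Phi \mid \Phi \, U^I_J \, \Phi.$$

In a similar way as before, we define $\Diamond^I_J \Phi = tt \, U^I_J \, \Phi$ and
$P_{\geq p}(\Box^I_J \Phi) = \neg P_{> 1-p}(\Diamond^I_J \neg\Phi)$.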

Interval $I$ can be considered as a timing constraint whereas $J$
represents a bound for the cumulative reward. The path-formula $X^I_J \Phi$
asserts that a transition is made to a $\Phi$-state at time point $t \in I$
such that the earned cumulative reward $r$ until time $t$ meets the bounds
specified by $J$, i.e., $r \in J$. The semantics of $\Phi_1 \, U^I_J \, \Phi_2$
is as for $\Phi_1 \, U \, \Phi_2$ with the additional constraints that the
$\Phi_2$-state is reached at some time point $t$ in $I$ and the earned
cumulative reward up to $t$ lies in $J$. As an example property for the
multi-processor system, $P_{\geq 0.95}(\Diamond^{[60,60]}_{[0,2]} tt)$ denotes
that with probability at least 0.95 the cumulative reward (e.g., the expected
capacity of the system for reward structure $\rho_2$) at time instant 60 is at
most 2. Given that the reward of a state indicates the number of jobs
processed per time-unit, property
$P_{\geq 0.98}(3mup \, U^{[0,30]}_{[7,\infty)} \, mdown)$ expresses that with
probability at least 0.98 at least 7 jobs have been processed (starting from
the initial state) before the first memory unit fails within 30 time units,
where 3mup is valid in states $(i, 3, 1)$, $1 \leq i < 4$, and mdown is valid in
states $(i, 2, 1)$, $0 \leq i < 4$.

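Semantics. The semantics of the CSRL path-formulas is defined as follows:

$\sigma \models X^I_J \Phi$ iff $\sigma[1]$ is defined and $\sigma[1] \models \Phi \wedge \delta(\sigma,0) \in I \wedge y(\sigma, \delta(\sigma,0)) \in J$
$\sigma \models \Phi_1 \, U^I_J \, \Phi_2$ iff $\exists t \in I.\ (\sigma@t \models \Phi_2 \wedge (\forall t' \in [0,t).\ \sigma@t' \models \Phi_1) \wedge y(\sigma,t) \in J)$.

Special cases occur for $I = [0,\infty)$ and $J = [0,\infty)$:

$$X\Phi = X^{[0,\infty)}_{[0,\infty)} \Phi \quad \text{and} \quad \Phi_1 \, U \, \Phi_2 = \Phi_1 \, U^{[0,\infty)}_{[0,\infty)} \, \Phi_2.$$

¹ Strictly speaking, the function $s \mapsto \mathit{Prob}^{\mathcal{C}}(s, \Phi_1 \, U \, \Phi_2)$
is the least fixpoint of a higher-order function on
$(S \to [0,1]) \to (S \to [0,1])$ where the underlying partial order on
$S \to [0,1]$ is defined for $F_1, F_2 : S \to [0,1]$ by $F_1 \leq F_2$ iff
$F_1(s) \leq F_2(s)$ for all $s \in S$.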






Thus, SL is a proper subset of this logic. The logic CSL [?] (or, timed
stochastic CTL) is obtained in case $J = [0,\infty)$ for all sub-formulas.
Similarly, we obtain the new logic CRL (reward-based stochastic CTL) in case
$I = [0,\infty)$ for all sub-formulas. In the sequel, intervals of the form
$[0,\infty)$ are often omitted from the operators.

We recall that $y(\sigma, t)$ denotes the cumulative reward along the prefix of
$\sigma$ up to time $t$. The intuition behind $y(\sigma, t)$ depends on the
formula under consideration and the interpretation of the rewards in the MRM
$\mathcal{M}$ under consideration. For instance, for
$\varphi = \Diamond good$ and path $\sigma$ that satisfies $\varphi$, the
cumulative reward $y(\sigma,t)$ can be interpreted as the cost to reach a good
state within $t$ time units. For $\varphi = \Diamond bad$, it may be
interpreted as the gain earned before reaching a bad state within $t$ time
units.

Alternative characterisations. We first observe that it suffices to consider
time and reward bounds specified by closed intervals. Let
$K = \{ x \in I \mid \rho(s) \cdot x \in J \}$ for closed intervals $I$ and
$J$. The probability of leaving state $s$ at some time point $x$ within the
interval $I$ such that the earned reward $\rho(s) \cdot x$ lies in $J$ can be
expressed by

$$P^I_J(s) = \int_K E(s) \cdot e^{-E(s) \cdot x} \, dx.$$

For instance, $P^{[0,t]}_{[0,\infty)}(s) = 1 - e^{-E(s) \cdot t}$, the
probability to leave state $s$ within $t$ time units where the reward earned
is irrelevant. If $\rho(s) = 2$, $I = [1,3]$ and $J = [9,11]$ then
$K = \emptyset$ and $P^I_J(s) = 0$. For $X^I_J \Phi$ we obtain:

$$\mathit{Prob}^{\mathcal{M}}(s, X^I_J \Phi) = P^I_J(s) \cdot P(s, \Phi).$$

For the case $I = J = [0,\infty)$ this reduces to equation (1).
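
The set K and the leaving probability can be worked out mechanically for closed intervals. The sketch below is only an illustration under stated assumptions: the names `leave_window` and `leave_probability` are hypothetical, intervals are encoded as `(lower, upper)` pairs, and the exit rate E(s) is assumed to be strictly positive.

```python
import math

def leave_window(I, J, reward):
    """K = {x in I | reward * x in J} for closed intervals; None if K is empty."""
    (i_lo, i_hi), (j_lo, j_hi) = I, J
    if reward > 0:
        lo = max(i_lo, j_lo / reward)
        hi = min(i_hi, j_hi / reward)
    else:
        # With a zero reward the earned reward is always 0, so J must contain 0.
        if j_lo > 0:
            return None
        lo, hi = i_lo, i_hi
    return (lo, hi) if lo <= hi else None

def leave_probability(I, J, reward, exit_rate):
    """Integral over K of E(s) * exp(-E(s) * x) dx, assuming exit_rate > 0."""
    K = leave_window(I, J, reward)
    if K is None:
        return 0.0
    lo, hi = K
    return math.exp(-exit_rate * lo) - math.exp(-exit_rate * hi)
```

With a reward of 2, I = [1, 3] and J = [9, 11], the window is empty because 2x lies in [9, 11] only for x in [4.5, 5.5], which matches the example in the text.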

Let $I \ominus x$ denote $\{ t - x \mid t \in I, t \geq x \}$. For
$\varphi = \Phi_1 \, U^I_J \, \Phi_2$ we have that
$\mathit{Prob}^{\mathcal{M}}(s, \varphi)$ is the least solution of the
following set of equations: $\mathit{Prob}^{\mathcal{M}}(s, \varphi) = 1$ if
$s \models \neg\Phi_1 \wedge \Phi_2$, $\inf I = 0$ and $\inf J = 0$,

$$\int_0^{\sup K} \sum_{s' \in S} P(s,s',x) \cdot \mathit{Prob}^{\mathcal{M}}(s', \Phi_1 \, U^{I \ominus x}_{J \ominus \rho(s) \cdot x} \, \Phi_2) \, dx \qquad (3)$$

if $s \models \Phi_1 \wedge \neg\Phi_2$, and

$$e^{-E(s) \cdot \inf K} + \int_0^{\inf K} \sum_{s' \in S} P(s,s',x) \cdot \mathit{Prob}^{\mathcal{M}}(s', \Phi_1 \, U^{I \ominus x}_{J \ominus \rho(s) \cdot x} \, \Phi_2) \, dx$$

if $s \models \Phi_1 \wedge \Phi_2$, and 0 otherwise, where
$P(s,s',x) = R(s,s') \cdot e^{-E(s) \cdot x}$ denotes the probability of moving
from state $s$ to $s'$ within $x$ time units. The above characterisation is
justified as follows. If $s$ satisfies $\Phi_1$ and $\neg\Phi_2$, the
probability of reaching a $\Phi_2$-state from $s$ within the interval $I$ by
earning a reward $r \in J$ equals the probability of reaching some direct
successor $s'$ of $s$ within $x$ time units ($x \leq \sup I$ and
$\rho(s) \cdot x \leq \sup J$, that is, $x \leq \sup K$), multiplied by the
probability of reaching a $\Phi_2$-state from $s'$ in the remaining time
interval $I \ominus x$ while earning a reward of $r - \rho(s) \cdot x$. If $s$
satisfies $\Phi_1 \wedge \Phi_2$, the path-formula $\varphi$ is satisfied if no
transition outgoing from $s$ is taken for at least $\inf K$ time units (first
summand).² Alternatively, state $s$ should be left before $\inf K$, in which
case the probability is defined in a similar way as for the case
$s \models \Phi_1 \wedge \neg\Phi_2$ (second summand). Note that $\inf K = 0$
is possible (if e.g., $\inf J = \inf I = 0$). In this case,
$s \models \Phi_1 \wedge \Phi_2$ yields that any path starting in $s$ satisfies
$\Phi_1 \, U^I_J \, \Phi_2$ and
$\mathit{Prob}^{\mathcal{M}}(s, \Phi_1 \, U^I_J \, \Phi_2) = 1$.

If the reward constraint is trivial, i.e., $J = [0,\infty)$, and $I$ is of the
form $[0,t]$ for $t \in \mathbb{R}_{\geq 0}$, then the characterisation for
$U^I_J$ reduces to the least solution of the following set of equations:
$\mathit{Prob}^{\mathcal{M}}(s, \Phi_1 \, U^{[0,t]} \, \Phi_2)$ equals 1 if
$s \models \Phi_2$, equals

$$\int_0^t \sum_{s' \in S} P(s,s',x) \cdot \mathit{Prob}^{\mathcal{M}}(s', \Phi_1 \, U^{[0,t-x]} \, \Phi_2) \, dx \qquad (4)$$

if $s \models \Phi_1 \wedge \neg\Phi_2$, and 0 otherwise. This coincides with
the characterisation for time-bounded until in [?]. For the special case
$I = J = [0,\infty)$ we obtain $K = [0,\infty)$ and hence the characterisation
for $U^I_J$ reduces to (2).

4 Duality 

In this section we present the main result of the paper, a duality theorem that
has important consequences for model checking sub-logics of CSRL. The basic
idea behind this duality, inspired by [?], is that the progress of time can be
regarded as the earning of reward and vice versa. First we obtain a duality
result for MRMs where all states have a positive reward. After that we consider
the (restricted) applicability of the duality result to MRMs with zero rewards.

Transformation of MRMs. Let $\mathcal{M} = (S, R, L, \rho)$ be an MRM that
satisfies $\rho(s) > 0$ for any state $s$. Define MRM
$\mathcal{M}^{-1} = (S, R', L, \rho')$ that results from $\mathcal{M}$ by:
(i) rescaling the transition rates by the reward of their originating state (as
originally proposed in [?]), i.e., $R'(s,s') = R(s,s')/\rho(s)$ and,
(ii) inverting the reward structure, i.e., $\rho'(s) = 1/\rho(s)$. Intuitively,
the transformation of $\mathcal{M}$ into $\mathcal{M}^{-1}$ stretches the
residence time in state $s$ with a factor that is proportional to the
reciprocal of its reward $\rho(s)$ if $\rho(s) > 1$, and it compresses the
residence time by the same factor if $0 < \rho(s) < 1$. The reward structure is
changed similarly. Note that $\mathcal{M} = (\mathcal{M}^{-1})^{-1}$.
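
The transformation is simple enough to state as code. The following sketch is illustrative only: the function name `invert_mrm` and the dictionary encoding of the rate matrix R and the reward structure are assumptions, and the labelling L is left out because the transformation does not change it.

```python
def invert_mrm(rates, reward):
    """Return (R', rho') with R'(s,s') = R(s,s') / rho(s) and rho'(s) = 1 / rho(s).

    rates  : dict state -> dict successor -> rate R(s, s')   (hypothetical encoding)
    reward : dict state -> reward rho(s), required to be strictly positive
    """
    if any(r <= 0 for r in reward.values()):
        raise ValueError("the transformation requires rho(s) > 0 for every state")
    # (i) rescale each outgoing rate by the reward of the originating state
    new_rates = {s: {t: r / reward[s] for t, r in succ.items()}
                 for s, succ in rates.items()}
    # (ii) invert the reward structure
    new_reward = {s: 1.0 / r for s, r in reward.items()}
    return new_rates, new_reward
```

Applying `invert_mrm` twice returns the original rates and rewards (up to floating-point rounding), which reflects the remark that the transformation is an involution.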

One might interpret the residence of $t$ time units in $\mathcal{M}^{-1}$ as the
earning of $t$ reward in state $s$ in $\mathcal{M}$, or (reversely) an earning
of a reward $r$ in state $s$ in $\mathcal{M}$ corresponds to a residence of $r$
in $\mathcal{M}^{-1}$. Thus, the notions of time and reward in $\mathcal{M}$
are reversed in $\mathcal{M}^{-1}$. Accordingly:

Lemma 1. For MRM $\mathcal{M} = (S, R, L, \rho)$ with $\rho(s) > 0$ for all
$s \in S$ and CSRL state-formulas $\Phi$, $\Phi_1$ and $\Phi_2$:

1. $\mathit{Prob}^{\mathcal{M}}(s, X^I_J \Phi) = \mathit{Prob}^{\mathcal{M}^{-1}}(s, X^J_I \Phi)$

2. $\mathit{Prob}^{\mathcal{M}}(s, \Phi_1 \, U^I_J \, \Phi_2) = \mathit{Prob}^{\mathcal{M}^{-1}}(s, \Phi_1 \, U^J_I \, \Phi_2)$.

² By convention, $\inf \emptyset = \infty$.

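We informally justify 2. for $I = [0,t]$ and $J = [0,r]$ with
$r, t \in \mathbb{R}_{\geq 0}$. Let MRM $\mathcal{M} = (S, R, L, \rho)$ with
$\rho(s) > 0$ for all $s \in S$. Let $s \in S$ be such that
$s \models \Phi_1 \wedge \neg\Phi_2$. From equation (3) we have that
$\mathit{Prob}^{\mathcal{M}^{-1}}(s, \Phi_1 \, U^I_J \, \Phi_2)$ equals

$$\int_{K'} \sum_{s' \in S} P'(s,s',x) \cdot \mathit{Prob}^{\mathcal{M}^{-1}}(s', \Phi_1 \, U^{I \ominus x}_{J \ominus \rho'(s) \cdot x} \, \Phi_2) \, dx$$

for $K' = \{ x \in [0,t] \mid \rho'(s) \cdot x \in [0,r] \}$, i.e.,
$K' = [0, \min(t, r/\rho'(s))]$. By the definition of $\mathcal{M}^{-1}$ this
equals

$$\int_{K'} \sum_{s' \in S} \frac{R(s,s')}{\rho(s)} \cdot e^{-\frac{E(s)}{\rho(s)} \cdot x} \cdot \mathit{Prob}^{\mathcal{M}^{-1}}(s', \Phi_1 \, U^{I \ominus x}_{J \ominus x/\rho(s)} \, \Phi_2) \, dx.$$

By substitution $y = x/\rho(s)$ this integral reduces to:

$$\int_K \sum_{s' \in S} R(s,s') \cdot e^{-E(s) \cdot y} \cdot \mathit{Prob}^{\mathcal{M}^{-1}}(s', \Phi_1 \, U^{I \ominus \rho(s) \cdot y}_{J \ominus y} \, \Phi_2) \, dy$$

where $K = [0, \min(t/\rho(s), r)]$. Thus, the function that maps $(s, I, J)$
onto $\mathit{Prob}^{\mathcal{M}^{-1}}(s, \Phi_1 \, U^I_J \, \Phi_2)$ meets the
fixed point equation for
$\mathit{Prob}^{\mathcal{M}}(s, \Phi_1 \, U^J_I \, \Phi_2)$. Using arguments of
fixed point theory, i.e., Tarski's theorem for least fixed points of monotonic
functions on lattices, it can be shown that these fixed points agree (as they
both are the least fixed point of the same operator). Thus, we obtain

$$\int_K \sum_{s' \in S} P(s,s',y) \cdot \mathit{Prob}^{\mathcal{M}}(s', \Phi_1 \, U^{J \ominus y}_{I \ominus \rho(s) \cdot y} \, \Phi_2) \, dy$$

and this equals $\mathit{Prob}^{\mathcal{M}}(s, \Phi_1 \, U^J_I \, \Phi_2)$ for
$s \models \Phi_1 \wedge \neg\Phi_2$, cf. (3). Since $I$ and $J$ are arbitrary,
this is statement 2. with the roles of $I$ and $J$ interchanged.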

For CSRL state-formula $\Phi$ let $\Phi^{-1}$ be defined as $\Phi$ where for
each sub-formula in $\Phi$ of the form $X^I_J$ or $U^I_J$ the intervals $I$
and $J$ are swapped. This notion can be easily defined by structural induction
on $\Phi$ and its definition is omitted here. For instance, for
$\Phi = P_{\geq 0.9}(\neg F \, U^I_J \, F)$ we have
$\Phi^{-1} = P_{\geq 0.9}(\neg F \, U^J_I \, F)$. We now have:

Theorem 1. For MRM $\mathcal{M} = (S, R, L, \rho)$ with $\rho(s) > 0$ for all
$s \in S$ and CSRL state-formula $\Phi$:

$$\mathit{Sat}^{\mathcal{M}}(\Phi) = \mathit{Sat}^{\mathcal{M}^{-1}}(\Phi^{-1}).$$
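
The interval-swapping map used in the theorem can be written down by structural recursion. The sketch below is one possible rendering, not the omitted definition itself: the nested-tuple encoding of formulas, the tag names and the function name `dual` are assumptions made for illustration.

```python
def dual(phi):
    """Swap the time interval I and the reward interval J in every X and U.

    Hypothetical encoding of formulas as nested tuples:
      ("tt",)  ("ap", name)  ("not", f)  ("and", f, g)
      ("S", op, p, f)  ("P", op, p, path)
      ("X", I, J, f)   ("U", I, J, f, g)
    """
    tag = phi[0]
    if tag in ("tt", "ap"):
        return phi
    if tag == "not":
        return ("not", dual(phi[1]))
    if tag == "and":
        return ("and", dual(phi[1]), dual(phi[2]))
    if tag in ("S", "P"):
        _, op, p, sub = phi
        return (tag, op, p, dual(sub))
    if tag == "X":
        _, I, J, sub = phi
        return ("X", J, I, dual(sub))            # intervals swapped
    if tag == "U":
        _, I, J, left, right = phi
        return ("U", J, I, dual(left), dual(right))
    raise ValueError(f"unknown formula tag: {tag!r}")
```

Because the map only exchanges the two interval positions, applying it twice yields the original formula, in line with the involution property of the model transformation.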

If $\mathcal{M}$ contains states equipped with a zero reward, this duality
result does not hold, as the reverse of earning a zero reward in $\mathcal{M}$
when considering $\Phi$ should correspond to a residence of 0 time units in
$\mathcal{M}^{-1}$ for $\Phi^{-1}$, which (as the advance of time in a state
cannot be halted) is in general not possible. However, the result of
Theorem 1 applies to some restricted, though still practical, cases, viz.
if (i) for each sub-formula of $\Phi$ of the form $X^I_J \Phi'$ we have
$J = [0,\infty)$, and (ii) for each sub-formula of the form
$\Phi_1 \, U^I_J \, \Phi_2$ we either have $J = [0,\infty)$ or
$\mathit{Sat}^{\mathcal{M}}(\Phi_1) \subseteq \{ s \in S \mid \rho(s) > 0 \}$,
i.e., all $\Phi_1$-states are positively rewarded. The intuition is that either
the reward constraint (i.e., time constraint) is trivial in $\Phi$ (in
$\Phi^{-1}$), or that zero-rewarded states are not involved in checking the
reward constraint. Here, we define $\mathcal{M}^{-1}$ by setting
$R'(s,s') = R(s,s')$ and $\rho'(s) = 0$ in case $\rho(s) = 0$, and as defined
above otherwise. For instance, Theorem 1 applies to the property
$P_{\geq 0.9}(\neg F \, U^I_J \, F)$ considered above for the multi-processor
example, since all $\neg F$-states have a positive reward.



5 Application of the Logic 

In this section, we discuss model checking of CSRL. We furthermore illustrate 
that CSRL and its fragments CSL and CRL provide ample means for the 
specification of performability measures. 

Model checking. CSL model checking can be carried out in the following
way. $S_{\bowtie p}(\Phi)$ gives rise to a system of linear equations for each
bottom strongly connected component of the graph underlying the CTMC [?]. The
probability to satisfy U- and X-path formulas can be obtained as the solution
of a system of linear equations, resp. a single matrix-vector multiplication
[?], based on (2) and (1). Finally, the probability to satisfy a
$U^I$-formula can be obtained as the solution of a system of Volterra integral
equations (4), that can be computed by either numerical integration [?], or
transient analysis of the CTMC [?]. From Theorem 1 we can conclude that model
checking an MRM against a CRL-formula can be performed using the algorithms
established for model checking CTMCs against CSL:

Corollary 1. For an MRM without any zero rewards, model checking CRL is
reducible to model checking CSL.

In a number of interesting, albeit restricted cases (cf. Sec. 4), the corollary
carries over to MRMs with zero rewards. The duality theorem does not provide an
algorithmic recipe for CSRL, but a direct solution using numerical integration
can be constructed based on the fixpoint characterisation for $U^I_J$. An
investigation of the feasibility of applying known efficient performability
evaluation algorithms to model checking CSRL is ongoing.

Typical performability measures. Performability measures that frequently
appear in the literature, e.g., [?], can be specified by simple CSRL-formulas.
This is illustrated by Table 1, where we list a (non-exhaustive) variety of
typical performability measures for the multi-processor system together with
the corresponding CSRL formulas. Measure (a) expresses a bound on the
steady-state availability of the system and (b) expresses (a bound on) the
probability to be not in a failed state at time t, i.e., the instantaneous
availability at time t. Measure (c) expresses the time until a failure,
starting from a non-failed state. Evaluating this measure for varying t gives
us the distribution of the time to failure. Measure (d) complements this by
expressing the distribution of the reward accumulated until failure. Measure
(e) generalises (c) and (d) by expressing the simultaneous distribution of the
accumulated reward against time, i.e., it expresses the probability for the
reward accumulated at t to be at most r. This measure coincides with the
performability distribution as proposed in the seminal paper [?]. Note that
for the computation of all these measures efficient algorithms do exist [?].
We emphasize that, in its full generality, CSRL allows one to specify much
more complex performability measures than previous ad hoc methods.

Table 1. Performability measures and their logical specification

     performability measure                      formula                                         logic
(a)  steady-state availability                   $S_{\bowtie p}(\neg F)$                         SL
(b)  instantaneous availability at time t        $P_{\bowtie p}(\Diamond^{[t,t]} \neg F)$        CSL
(c)  distribution of time to failure             $P_{\bowtie p}(\neg F \, U^{[0,t]} \, F)$       CSL
(d)  distribution of reward until failure        $P_{\bowtie p}(\neg F \, U_{[0,r]} \, F)$       CRL
(e)  distribution of cumulative reward until t   $P_{\bowtie p}(\Diamond^{[t,t]}_{[0,r]} tt)$    CSRL

A possible extension of CSRL. Consider state $s$ in MRM $\mathcal{M}$. For time
$t$ and set of states $S'$, the instantaneous reward
$\rho^{\mathcal{M}}(s, S', t)$ equals
$\sum_{s' \in S'} \pi^{\mathcal{M}}(s,s',t) \cdot \rho(s')$ and denotes the
rate at which reward is earned in some state in $S'$ at time $t$. The expected
(or long run) reward rate $\rho^{\mathcal{M}}(s, S')$ equals
$\sum_{s' \in S'} \pi^{\mathcal{M}}(s,s') \cdot \rho(s')$. We can now add the
following operators to our framework:

$s \models \mathcal{E}_J(\Phi)$ iff $\rho^{\mathcal{M}}(s, \mathit{Sat}^{\mathcal{M}}(\Phi)) \in J$
$s \models \mathcal{E}^t_J(\Phi)$ iff $\rho^{\mathcal{M}}(s, \mathit{Sat}^{\mathcal{M}}(\Phi), t) \in J$
$s \models \mathcal{C}^I_J(\Phi)$ iff $\int_I \rho^{\mathcal{M}}(s, \mathit{Sat}^{\mathcal{M}}(\Phi), u) \, du \in J$

Although the duality principle is not applicable to the new operators, their
model checking is rather straightforward. The first two formulas require the
summation of the $\Phi$-conforming steady-state or transient state
probabilities (as computed for measures (a) and (b)) multiplied with the
corresponding rewards. The operator $\mathcal{C}^I_J(\Phi)$ states that the
expected amount of reward accumulated in $\Phi$-states during the interval $I$
lies in $J$. It can be evaluated using a variant of uniformisation [?]. Some
example properties are now: $\mathcal{E}_J(\neg F)$, which expresses the
expected reward rate (e.g., the system's capacity) for an operational system,
$\mathcal{E}^t_J(tt)$, which expresses the expected instantaneous reward rate
at time $t$, and $\mathcal{C}^{[0,t]}_J(tt)$, which expresses the amount of
cumulated reward up to time $t$.

6 Concluding Remarks 

We introduced a continuous-time, reward-based stochastic logic which is
adequate for expressing performability measures of a large variety. Two
important sub-logics were identified, viz. CSL [?], and the novel logic CRL
that allows one to express reward-based properties. The main result of the
paper is that CSL and CRL are complementary, implying that CRL-properties for
a Markov reward model can be interpreted as CSL-properties over a derived
CTMC, so that existing model checking procedures for CSL can still be
employed. The model checking of the full logic CSRL, in particular properties
in which time- and reward-bounds are combined, is left for future work.

Acknowledgement. We thank the reviewers for their helpful comments.

Abstract. In this paper we investigate timed polyhedra, i.e. polyhedra
which are finite unions of full dimensional simplices of a special kind.
Such polyhedra form the basis of timing analysis and in particular of
verification tools based on timed automata. We define a representation
scheme for these polyhedra based on their extreme vertices, and show
that this compact representation scheme is canonical for all (convex and
non-convex) polyhedra in any dimension. We then develop relatively
efficient algorithms for membership, boolean operations, projection and
passage of time for this representation.



1 Introduction and Motivation 

Timed automata, automata augmented with clock variables [AD94], have proven
to be a very useful formalism for modeling phenomena which involve both
discrete transitions and quantitative timing information. Although their
state-space is non-countable, the reachability problem, as well as other
verification, synthesis and optimization problems for timed automata are
solvable. This is due to the fact that the clock space admits an equivalence
relation (time-abstract bisimulation) of finite index, and it is hence
sufficient to manipulate these equivalence classes, which form a restricted
class of polyhedra that we call timed polyhedra.

Several verification tools for timed automata have been built during the last
decade, e.g. Kronos [DOTY96, Y97], Timed Cospan [AK96] and Uppaal [LPY97],
and the manipulation of timed polyhedra is the computational core of such
tools. Difference bound matrices (DBM) are a well-known data-structure for
representing convex timed polyhedra, but usually, in the course of
verification, the accumulated reachable states can form a highly non-convex
set whose representation and manipulation pose serious performance problems to
these tools. Consequently, the search for new representation schemes for timed
polyhedra is a very active domain of research [ABK+97, ?, ?, ?, BLP+99, ?].

In this paper, we propose a new representation scheme for non-convex timed
polyhedra based on a reformulation and extension of our previous work in
[BMP99] where we proposed a canonical representation for non-convex orthogonal
polyhedra. As in [BMP99] the representation is based on a certain subset of
the vertices of the polyhedron. The size of our canonical representation is
$O(nd!)$ where $n$ is the number of vertices and $d$ is the dimension. Based on
this representation we develop relatively efficient algorithms for membership,
Boolean operations, projection and passage of time on arbitrary timed
polyhedra of any dimension. In order to simplify the presentation we restrict
the discussion in this paper to full-dimensional timed polyhedra, but the
results can be extended to treat unions of polyhedra of varying dimension.

* This work was partially supported by the European Community Esprit-LTR Project
26270 VHS (Verification of Hybrid systems) and the French-Israeli collaboration
project 970maefut5 (Hybrid Models of Industrial Plants).

The rest of the paper is organized as follows: in section 2 we define orthogonal 
polyhedra and give new proofs of the main results from [BMP99] concerning their
representation by extreme vertices. In section 3 we introduce timed polyhedra 
and prove that they can be represented canonically by their extreme vertices. In 
section 4 we discuss Boolean operations, projections, and the calculation of the 
effect of time passage. Finally we mention some related work and future research 
directions. 



2 Griddy Polyhedra and Their Representation 



Throughout the paper we assume a d-dimensional $\mathbb{R}$-vector space. In
this section we also assume a fixed basis for this space so that points and
subsets can be identified through their coordinates by points and subsets of
$\mathbb{R}^d$. The results of this section are invariant under change of
basis and this fact will be exploited in the next section.

We assume that all our polyhedra live inside a bounded subset
$X = [0,m]^d \subset \mathbb{R}^d$ (in fact, the results hold also for
$\mathbb{R}_+^d$). We denote elements of $X$ as
$\mathbf{x} = (x_1, \ldots, x_d)$, the zero vector by $\mathbf{0}$ and the
vector $(1, 1, \ldots, 1)$ by $\mathbf{1}$. The elementary grid associated
with $X$ is $G = \{0, 1, \ldots, m-1\}^d \subset \mathbb{N}^d$. For every
point $\mathbf{x} \in X$, $\lfloor \mathbf{x} \rfloor$ is the grid point
corresponding to the integer part of the components of $\mathbf{x}$. The grid
admits a natural partial order defined as
$(x_1, \ldots, x_d) \leq (x'_1, \ldots, x'_d)$ if for every $i$,
$x_i \leq x'_i$. The set of subsets of the elementary grid forms a Boolean
algebra $(2^G, \cap, \cup, \sim)$ under the set-theoretic operations.



Definition 1 (Griddy Polyhedra). Let $\mathbf{x} = (x_1, \ldots, x_d)$ be a
grid point. The elementary box associated with $\mathbf{x}$ is the closed
subset of $X$ of the form
$B(\mathbf{x}) = [x_1, x_1+1] \times [x_2, x_2+1] \times \ldots \times [x_d, x_d+1]$.
The point $\mathbf{x}$ is called the leftmost corner of $B(\mathbf{x})$. The
set of boxes is denoted by $\mathcal{B}$. A griddy polyhedron $P$ is a union of
elementary boxes, i.e. an element of $2^{\mathcal{B}}$.



Griddy polyhedra were called orthogonal polyhedra in [AA98, BMP99]. We prefer
here the term griddy since the results do not depend on an orthogonal basis.
Although $2^{\mathcal{B}}$ is not closed under usual complementation and
intersection, it is closed under the following operations:

$$A \sqcup B = A \cup B \qquad A \sqcap B = cl(int(A) \cap int(B)) \qquad \neg A = cl(\sim A)$$

(where $cl$ and $int$ are the topological closure and interior operations¹).
The bijection $B$ between $G$ and $\mathcal{B}$ which associates every box with
its leftmost corner clearly induces an isomorphism between
$(2^G, \cap, \cup, \sim)$ and $(2^{\mathcal{B}}, \sqcap, \sqcup, \neg)$. In the
sequel we will switch between point-based and box-based terminology according
to what serves better the intuition.

Definition 2 (Color Function). Let $P$ be a griddy polyhedron. The color
function $c : X \to \{0,1\}$ is defined as follows: if $\mathbf{x}$ is a grid
point then $c(\mathbf{x}) = 1$ iff $B(\mathbf{x}) \subseteq P$; otherwise,
$c(\mathbf{x}) = c(\lfloor \mathbf{x} \rfloor)$.

Note that $c$ almost coincides with the characteristic function of $P$ as a
subset of $X$. It differs from it only on right-boundary points (see
Figure 1-(a)).




Fig. 1. (a) A griddy polyhedron and a sample of the values of the color function it
induces on X. (b) The neighborhood and predecessors of a grid point x.



Definition 3 (Predecessors, Neighborhoods and Cones). In the following
we consider $\mathbf{x}$ to be a grid point $\mathbf{x} = (x_1, \ldots, x_d)$.

- The i-predecessor of $\mathbf{x}$ is
  $\mathbf{x}^{i-} = (x_1, \ldots, x_i - 1, \ldots, x_d)$. We use
  $\mathbf{x}^{ij-}$ as a shorthand for $(\mathbf{x}^{i-})^{j-}$.

- The neighborhood of $\mathbf{x}$ is the set
  $N(\mathbf{x}) = \{x_1 - 1, x_1\} \times \ldots \times \{x_d - 1, x_d\}$,
  i.e. the vertices of a box lying between $\mathbf{x} - \mathbf{1}$ and
  $\mathbf{x}$ (Figure 1-(b)).

- The backward cone based at $\mathbf{x}$ is
  ${\triangleright}\mathbf{x} = \{ \mathbf{y} \in G : \mathbf{y} \leq \mathbf{x} \}$
  (Figure 2-(a)).

- The forward cone based at $\mathbf{x}$ is
  $\mathbf{x}^{\triangleleft} = \{ \mathbf{y} \in G : \mathbf{x} \leq \mathbf{y} \}$
  (Figure 2-(b)).

Every grid point $\mathbf{x}$ is contained in all the forward cones
$\mathbf{y}^{\triangleleft}$ such that $\mathbf{y} \in {\triangleright}\mathbf{x}$.

¹ See [?] for definitions.

Let $\oplus$ denote addition modulo 2, known also as the exclusive-or (XOR)
operation in Boolean algebra, $p \oplus q = (p \wedge \neg q) \vee (\neg p \wedge q)$.
We will use the same notation for the symmetric set difference operation on
sets, defined as $A \oplus B = \{ x : (x \in A) \oplus (x \in B) \}$.² This is
an associative and commutative operation, satisfying $a \oplus a = 0$ on
numbers, and $A \oplus A = \emptyset$ on sets. We will show that every griddy
polyhedron admits a canonical decomposition into a XOR of cones.





Fig. 2. (a) The backward cone based at x. (b) The forward cone based at x.



Theorem 1 ($\oplus$-representation). For any griddy polyhedron $P$ there is a
unique finite set $V$ of grid points such that
$P = \bigoplus_{\mathbf{v} \in V} \mathbf{v}^{\triangleleft}$.

Proof. First we show existence of $V$. Let $\preceq$ be a total order relation
on $G$, $\mathbf{x}_1 \preceq \mathbf{x}_2 \preceq \ldots \preceq \mathbf{x}_{m^d}$,
which is consistent with the partial order $\leq$ (any lexicographic or
diagonal order will do). By definition, $\mathbf{x} \prec \mathbf{y}$ implies
$\mathbf{y} \notin {\triangleright}\mathbf{x}$. The following iterative
algorithm constructs $V$. We denote by $V_j$ the value of $V$ after $j$ steps
and by $c_j$ the color function of the associated polyhedron
$P_j = \bigoplus_{\mathbf{v} \in V_j} \mathbf{v}^{\triangleleft}$.

    V_0 := {}
    for j = 1 to m^d do
        if c(x_j) != c_{j-1}(x_j)
            then V_j := V_{j-1} + {x_j}
            else V_j := V_{j-1}
    end

This algorithm is correct because at the end of every step $j$ in the loop we
have $c_j(\mathbf{x}_k) = c(\mathbf{x}_k)$ for every $k \leq j$. It holds
trivially at the beginning and in every step, the color of $\mathbf{x}_k$ for
all $k < j$ is preserved (because their color is not influenced by the cone
$\mathbf{x}_j^{\triangleleft}$) and updated correctly for $\mathbf{x}_j$ if
needed.

² Note that on $2^{\mathcal{B}}$, $A \oplus B$ should be modified into
$cl(int(A) \oplus int(B))$.

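For uniqueness we show that if there are two sets of points $U$ and $V$ such
that

$$\bigoplus_{\mathbf{v} \in V} \mathbf{v}^{\triangleleft} = \bigoplus_{\mathbf{u} \in U} \mathbf{u}^{\triangleleft}$$

then $U = V$. Indeed, by the properties of $\oplus$, we have

$$\emptyset = P \oplus P = \bigoplus_{\mathbf{v} \in V} \mathbf{v}^{\triangleleft} \oplus \bigoplus_{\mathbf{u} \in U} \mathbf{u}^{\triangleleft} = \bigoplus_{\mathbf{y} \in U \oplus V} \mathbf{y}^{\triangleleft}$$

(if a point $\mathbf{v}$ appears in both $U$ and $V$ it can be eliminated
because $\mathbf{v}^{\triangleleft} \oplus \mathbf{v}^{\triangleleft} = \emptyset$).
The set denoted by the rightmost expression is empty only if
$U \oplus V = \emptyset$, otherwise any minimal vertex $\mathbf{v}$ in
$U \oplus V$ belongs to it. □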
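
The existence part of the proof is constructive, and the iterative algorithm can be sketched directly. The code below is an illustration under assumptions: the name `cone_basis`, the representation of the color function as a Python callable on grid points, and the choice of lexicographic order (as produced by `itertools.product`) are not taken from the paper. It recomputes the current color of each point from scratch, so it favours clarity over efficiency.

```python
from itertools import product

def in_backward_cone(y, x):
    """True iff y <= x componentwise, i.e. y lies in the backward cone of x."""
    return all(yi <= xi for yi, xi in zip(y, x))

def cone_basis(color, m, d):
    """Construct V for the grid {0, ..., m-1}^d from a color function.

    color : callable mapping a grid point (tuple of ints) to 0 or 1
    """
    V = []
    # product(...) enumerates points lexicographically, which is a total
    # order consistent with the componentwise partial order.
    for x in product(range(m), repeat=d):
        # Color of x in the polyhedron built so far: parity of the cones
        # whose base point lies in the backward cone of x.
        current = sum(in_backward_cone(v, x) for v in V) % 2
        if color(x) != current:
            V.append(x)
    return V
```

For a single elementary box with leftmost corner (1, 1) in a 3 × 3 grid, the construction returns the four corners of that box, and the parity of cones covering any grid point reproduces the original color.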

Remark: Since every set of points defines a $\oplus$-formula and a polyhedron,
we have an interesting non-trivial bijection on subsets of $G$. Note that for
the chess board $V = G$.

Let $V : G \to \{0,1\}$ be the characteristic function of the set
$V \subseteq G$.

Observation 1. For every point $\mathbf{x}$,
$c(\mathbf{x}) = \bigoplus_{\mathbf{y} \in {\triangleright}\mathbf{x}} V(\mathbf{y})$.

This gives us immediately a decision procedure for the membership problem
$\mathbf{x} \in P$: just check the parity of $V \cap {\triangleright}\mathbf{x}$.
In [BMP99] we have shown that the elements of $V$ are those vertices of $P$
that satisfy some local conditions. In the following we give an alternative
proof of this fact, which requires some additional definitions to facilitate
induction over the dimensions.
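
The membership test suggested by Observation 1 amounts to a parity count. The sketch below is illustrative: the function name `member` and the encoding of V as a collection of integer tuples are assumptions. Following the definition of the color function, a point with real coordinates is first mapped to its grid cell, so the result agrees with set membership except on right-boundary points.

```python
import math

def member(V, x):
    """Color of the point x: parity of the number of cone bases below floor(x).

    V : iterable of grid points (tuples of ints) from the XOR decomposition
    x : tuple of non-negative real coordinates
    """
    cell = tuple(math.floor(xi) for xi in x)
    count = sum(all(vi <= ci for vi, ci in zip(v, cell)) for v in V)
    return count % 2 == 1
```

Each query inspects every element of V once, so its cost is linear in the number of stored vertices for a fixed dimension.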

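Definition 4 (k-Neighborhood and k-Backward Cone). Let
$\mathbf{x} = (x_1, \ldots, x_d)$.

- The k-neighborhood of $\mathbf{x}$ is the set
  $N^k(\mathbf{x}) = \{x_1 - 1, x_1\} \times \ldots \times \{x_k - 1, x_k\} \times \{x_{k+1}\} \times \ldots \times \{x_d\}$.

- The k-backward cone of $\mathbf{x}$ is the set
  $M^k(\mathbf{x}) = \{x_1\} \times \ldots \times \{x_k\} \times \{0, \ldots, x_{k+1}\} \times \ldots \times \{0, \ldots, x_d\}$.

Note that $N^0(\mathbf{x}) = \{\mathbf{x}\}$, $N^d(\mathbf{x}) = N(\mathbf{x})$,
$M^0(\mathbf{x}) = {\triangleright}\mathbf{x}$ and
$M^d(\mathbf{x}) = \{\mathbf{x}\}$ (see Figure 3).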

Observation 2. For every $k > 0$,

$N^k(\mathbf{x}) = N^{k-1}(\mathbf{x}) \cup N^{k-1}(\mathbf{x}^{k-})$ and
$M^k(\mathbf{x}) = M^{k-1}(\mathbf{x}) \oplus M^{k-1}(\mathbf{x}^{k-})$.

Fig. 3. The k-backward cone and the k-neighborhood for k = 0, 1, 2.



Theorem 2 (Extreme Vertices). A point $\mathbf{x}$ is a basis for a cone in the
canonical decomposition of a griddy polyhedron $P$ if and only if $\mathbf{x}$
is a vertex of $P$ satisfying

$$\bigoplus_{\mathbf{y} \in N(\mathbf{x})} c(\mathbf{y}) = 1 \qquad (1)$$

Proof. First we prove that for every $\mathbf{x} \in G$ and every $k$,
$0 \leq k \leq d$,

$$\bigoplus_{\mathbf{y} \in N^k(\mathbf{x})} c(\mathbf{y}) = \bigoplus_{\mathbf{y} \in M^k(\mathbf{x})} V(\mathbf{y}). \qquad (2)$$

The proof is done by induction on $k$. For $k = 0$ it is just a rephrasing of
Observation 1. Suppose it holds for $k - 1$. By summing up the claims for
$\mathbf{x}$ and for $\mathbf{x}^{k-}$ we obtain

$$\bigoplus_{\mathbf{y} \in N^{k-1}(\mathbf{x})} c(\mathbf{y}) \oplus \bigoplus_{\mathbf{y} \in N^{k-1}(\mathbf{x}^{k-})} c(\mathbf{y}) = \bigoplus_{\mathbf{y} \in M^{k-1}(\mathbf{x})} V(\mathbf{y}) \oplus \bigoplus_{\mathbf{y} \in M^{k-1}(\mathbf{x}^{k-})} V(\mathbf{y})$$

which, by virtue of Observation 2 and the inductive hypothesis, gives the
result for $k$. Substituting $k = d$ in (2), we characterize the elements of
$V$ as those satisfying condition (1). In [BMP99] we have proved that these
points constitute a subset of the vertices of $P$ which we call "extreme"
vertices, following a geometrical definition in [AA98] for $d \leq 3$. □
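
Condition (1) gives a purely local test that can be checked without building the decomposition. The following sketch is an illustration only: the names `neighborhood_parity` and `extreme_vertices` are hypothetical, the color function is assumed to be a callable returning 0 or 1 on grid points, and neighbors with a negative coordinate are treated as having color 0 because they lie outside X.

```python
from itertools import product

def neighborhood_parity(color, x):
    """XOR of the colors over N(x) = {x_1 - 1, x_1} x ... x {x_d - 1, x_d}."""
    total = 0
    for y in product(*[(xi - 1, xi) for xi in x]):
        if min(y) >= 0:              # points outside the grid count as color 0
            total ^= color(y)
    return total

def extreme_vertices(color, m, d):
    """Grid points of {0, ..., m-1}^d whose neighborhood has odd color parity."""
    return [x for x in product(range(m), repeat=d)
            if neighborhood_parity(color, x) == 1]
```

For the single box with leftmost corner (1, 1) in a 3 × 3 grid, this local test selects exactly the four corners of the box, the same set that the iterative construction in the proof of Theorem 1 produces.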

We review our main algorithmic results from [BMP99], assuming a fixed
dimension $d$ and denoting by $n_P$ the number of extreme vertices of a
polyhedron $P$, which we assume to be sorted in some fixed order.

Observation 3 (Boolean Operations). Let $A$ and $B$ be two griddy polyhedra.

- The symmetric difference of $A$ and $B$, $A \oplus B$, can be computed in
  time $O(n_A + n_B)$.

- The complement of $A$, $X - A$, can be obtained in time $O(1)$.

- The union, $A \cup B$, and the intersection, $A \cap B$, can be computed in
  time $O(n_A n_B)$.

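Proof. Computing $A \oplus B$ is trivial and so is complementation using
$X - A = \mathbf{0}^{\triangleleft} \oplus A$. Observing that
$A \cup B = A \oplus B \oplus A \cap B$, computing union reduces to computing
intersection. Now, using recursively the identity
$(A \oplus B) \cap C = (A \cap C) \oplus (B \cap C)$, write

$$\Big(\bigoplus_i \mathbf{x}_i^{\triangleleft}\Big) \cap \Big(\bigoplus_j \mathbf{y}_j^{\triangleleft}\Big) = \bigoplus_{i,j} (\mathbf{x}_i^{\triangleleft} \cap \mathbf{y}_j^{\triangleleft}) = \bigoplus_{i,j} \max(\mathbf{x}_i, \mathbf{y}_j)^{\triangleleft}$$

where max denotes the maximum component-wise. □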
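
The identities used in the proof translate almost literally into operations on vertex sets. The sketch below is a simplified illustration: the function names and the use of Python sets of integer tuples are assumptions, and hash-based sets replace the sorted vertex lists assumed in the complexity bounds, so the code shows the algebra rather than the stated running times.

```python
def color(V, x):
    """Parity of the cone bases v in V with v <= x componentwise."""
    return sum(all(vi <= xi for vi, xi in zip(v, x)) for v in V) % 2

def symmetric_difference(A, B):
    """Vertex set of A xor B: base points occurring in both cancel out."""
    return A ^ B

def complement(A, d):
    """Vertex set of X - A, using X - A = (forward cone of the origin) xor A."""
    return A ^ {(0,) * d}

def intersection(A, B):
    """Vertex set of the intersection: XOR over all pairs of the cone based at
    the componentwise maximum."""
    result = set()
    for x in A:
        for y in B:
            result ^= {tuple(max(a, b) for a, b in zip(x, y))}
    return result

def union(A, B):
    """Vertex set of the union, via A or B = A xor B xor (A and B)."""
    return A ^ B ^ intersection(A, B)
```

As a small check, intersecting the square with corners (0, 0) and (2, 2) with the square with corners (1, 1) and (3, 3) yields the four corners of the unit box at (1, 1).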

We use the following terminology from [BMP99]:

Definition 5 (Slices, Sections and Cones). Let $P$ be a griddy polyhedron,
$i \in \{1, \ldots, d\}$ and $z \in \{0, \ldots, m-1\}$.

- The (i,z)-slice of $P$ is the d-dimensional polyhedron
  $J_{i,z}(P) = P \sqcap \{ \mathbf{x} : z \leq x_i \leq z+1 \}$.

- The (i,z)-section of $P$ is the (d-1)-dimensional polyhedron
  $\mathcal{J}_{i,z}(P) = J_{i,z}(P) \cap \{ \mathbf{x} : x_i = z \}$.

- The i-projection of $P$ is the (d-1)-dimensional polyhedron
  $\pi_{\downarrow i}(P) = \{ (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d) : \exists z \ (x_1, \ldots, x_{i-1}, z, x_{i+1}, \ldots, x_d) \in P \}$.

Observation 4 (Computation of Slice, Section and Projection). The
computation of $J_{i,z}$ and of $\mathcal{J}_{i,z}$ can be performed in time
$O(n)$. The projection $\pi_{\downarrow i}(P)$ of $P$ is computable in time
$O(n^{?})$.

Proof. Using identity $(A \oplus B) \cap C = (A \cap C) \oplus (B \cap C)$, we
have

$$J_{i,z}\Big(\bigoplus_{\mathbf{x} \in V} \mathbf{x}^{\triangleleft}\Big) = \bigoplus_{\mathbf{x} \in V} J_{i,z}(\mathbf{x}^{\triangleleft}).$$

This gives an immediate algorithm for computing $J_{i,z}(P)$ and, hence, for
computing $\mathcal{J}_{i,z}(P)$. For projection write
$\pi_{\downarrow i}(P) = \bigcup_z \pi_{\downarrow i}(\mathcal{J}_{i,z}(P))$. □

3 Timed Polyhedra and Their Representation

In this section we extend our results to timed polyhedra whose building blocks
are certain types of simplices. Let $\Pi$ denote the set of permutations on
$\{1, \ldots, d\}$. We write permutations $\sigma \in \Pi$ as
$(\sigma(1)\ \sigma(2)\ \cdots\ \sigma(d))$. There is a natural bijection
between $\Pi$ and the set of simplices occurring in a specific triangulation
of the d-unit hypercube (sometimes called the Kuhn triangulation [?]). Hence
the correspondence between grid points and elementary boxes from the previous
section can be extended into a correspondence between $G \times \Pi$ and the
set of such triangulations of the elementary boxes in $X$.

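Definition 6 (Elementary Simplices, Timed Polyhedra and Cones). Let
$\mathbf{v}$ be a grid point and let $\sigma$ be a permutation.

- The elementary simplex associated with $(\mathbf{v}, \sigma)$ is
  $B(\mathbf{v}, \sigma) = \{ \mathbf{x} : 0 \leq x_{\sigma(1)} - v_{\sigma(1)} \leq x_{\sigma(2)} - v_{\sigma(2)} \leq \cdots \leq x_{\sigma(d)} - v_{\sigma(d)} \leq 1 \}$.

- A timed polyhedron is a union of elementary simplices. It can be expressed
  as a color function $c : G \times \Pi \to \{0,1\}$.

- The $\sigma$-forward cone based on a point $\mathbf{v}$ is
  $(\mathbf{v}, \sigma)^{\triangleleft} = \{ \mathbf{x} : 0 \leq x_{\sigma(1)} - v_{\sigma(1)} \leq x_{\sigma(2)} - v_{\sigma(2)} \leq \cdots \leq x_{\sigma(d)} - v_{\sigma(d)} \}$.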

These notions are illustrated in Figure El for d = 2. In the sequel we will use the 
word simplex both for a pair (v, cr) and for the set 77(v, cr). Note that for every 
V, 

07?(v,a) = 77(v) and 0(v,a)" = vT 

(T^IJ 



✓ 




0 1 2 3 4 5 6 



Fig. 4. The set of basic simplices in dimension 2, the simplices 7?((1, 1), (2 1)) and 
73((3, 0), (1 2)) and the forward-cones ((3, 2), (2 1))^ and ((1, 3), (1 2))^'. 
Theorem 3 ($\oplus$-Representation of Timed Polyhedra). For any timed
polyhedron P, there is a unique finite set $V \subseteq G \times \Pi$ such that

$$P = \bigoplus_{(v,\sigma) \in V} (v,\sigma)^{\triangleleft}.$$

Proof. The proof is similar to the proof of Theorem^ The only additional detail 
is the extension of the order on G into a lexicographic order on $G \times \Pi$. Since
the cones $(x,\sigma)^{\triangleleft}$ and $(x,\tau)^{\triangleleft}$ are disjoint for $\sigma \neq \tau$, the properties used in the
proof are preserved. □

Our goal for the rest of the section is to find a local characterization of the 
elements (v, a) of V, similar to property (CQ in Theorem 0 based on the parity 
of the colors on some suitably-defined neighborhood of (v, cr). 

Observation 5 (Decomposition of Timed Polyhedra). Any timed polyhedron
P can be decomposed into

$$P = \bigoplus_{\sigma \in \Pi} P_\sigma$$

where each $P_\sigma$ is a griddy polyhedron in some basis.

Proof. By letting $V_\sigma = \{v : (v,\sigma) \in V\}$ we can rewrite P as

$$P = \bigoplus_{\sigma \in \Pi} \bigoplus_{v \in V_\sigma} (v,\sigma)^{\triangleleft} = \bigoplus_{\sigma \in \Pi} P_\sigma.$$

Each $P_\sigma$ is a XOR of $\sigma$-cones, and hence it is griddy in the coordinate system
corresponding to $\sigma$, which is related to the orthogonal system by the
transformation $y_{\sigma(1)} = x_{\sigma(1)}$ and $y_{\sigma(i)} = x_{\sigma(i)} - x_{\sigma(i-1)}$ for $i \geq 2$. □

We denote the color function of P„ by Co-. We call the corresponding grid the a- 
grid. In this grid the i-predecessor of x is denoted by x'^®, and the neighborhood 
of X by Ner{x) (see Figure 0). By definition (x,ct) G fo iff if x G Va, and, since 
Theorem 0 does not depend on orthogonality, we have: 

Observation 6. A simplex $(x,\sigma)$ occurs in V iff $\bigoplus_{y \in N_\sigma(x)} c_\sigma(y,\sigma) = 1$.

This definition is based on P^, not on P itself. We will show that this sum can 
be reduced to the sum of the values of c on a certain set of simplices ^(x, cr), to 
be defined below. 

Definition 7 (Permutation and Simplex Predecessors). With every i G 
{1, . . . , d} define the i-predecessor function : 77 — >■ 7T such that for every cr 
such that <j{k) = i 

{ a(j) if j < k 

O'O' + l)ifk<j<d 
i ifj = d 

The i-predecessor of a simplex (x, cr), for i G {1, . . . , <7} is defined as (x, cr)'*“® = 

(x^ber-^O- 
Fig. 5. (a) The orthogonal grid. (b) The (2 1)-grid and $N_{(2\ 1)}(x)$. (c) The (1 2)-grid
and $N_{(1\ 2)}(x)$.



In other words, is the permutation obtained from a by picking Xi and putting 
it at the end of the ordering. We denote by the successive application of 
the operator Note that unlike the analogue definition on the grid, the 

permutation predecessors is not commutative, i.e. ^ 
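
The i-predecessor of a permutation, as described in words above (pick the coordinate i and put it at the end of the ordering), can be sketched in a few lines. The function name `perm_predecessor` and the tuple encoding of permutations are assumptions made for illustration; the small check at the end shows the non-commutativity noted in the text.

```python
def perm_predecessor(sigma, i):
    """Return the permutation obtained from sigma by moving coordinate i to the end.

    sigma is a tuple listing the coordinates 1..d in order.
    """
    k = sigma.index(i)  # position of i in the ordering
    return sigma[:k] + sigma[k + 1:] + (i,)


# Applying the 1-predecessor then the 2-predecessor differs from the reverse order.
first_1_then_2 = perm_predecessor(perm_predecessor((1, 2, 3), 1), 2)
first_2_then_1 = perm_predecessor(perm_predecessor((1, 2, 3), 2), 1)
```

Here `first_1_then_2` is (3 1 2) while `first_2_then_1` is (3 2 1), so the order of application matters.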

The fact that (y,r) = (x,cr)'*“® has the following geometrical interpretation: 
B(y,T) is the first simplex outside B(x) encountered while going backward in 
direction i from B{x,<j). We can lift these functions into — >■ 2^^^ 

in the natural way. The following definition is crucial for this paper. It specifies 
which neighborhood of a simplex determines its membership in V . 

Definition 8 (Simplex fc-Neighborhood). The simplex k -neighborhood of a 
simplex (x,ct) is the set of simplices defined recursively as: S'°(x, tr) = {(x, cr)} 
and 

5''=+i(x,cr) = S''=(x,a) U 

This notion is depicted in Figure El We write S for S'^. Note that unlike the 
definition of neighborhood for griddy polyhedra (Definition^, where the recur- 
sion over dimensions can be done in any order, here the order is important and 
depends on the permutation. 

We intend to show that

$$\bigoplus_{y \in N_\sigma(x)} c_\sigma(y,\sigma) = \bigoplus_{(y,\tau) \in S(x,\sigma)} c(y,\tau).$$



Definition 9 (Close Permutations). With every $\sigma \in \Pi$ and every i, $0 \leq i \leq d$,
we associate a set of permutations $\Pi_\sigma^i$ defined recursively as $\Pi_\sigma^0 = \Pi$ and

$$\Pi_\sigma^{i+1} = \Pi_\sigma^i \cap \{\tau : \sigma(i+1) = \tau(i+1)\}.$$

In other words, $\Pi_\sigma^i$ is the set of permutations that agree with $\sigma$ on the first i
coordinates.
Fig. 6. The simplex k-neighborhoods of two simplices.



Note, of course, that $\Pi_\sigma^d = \{\sigma\}$. With every such set of permutations we associate
the polyhedron $P_\sigma^i = \bigoplus_{\tau \in \Pi_\sigma^i} P_\tau$ and let $c_\sigma^i$ denote its corresponding color function.

For every k, 0 < k < d, let pki^jc) denote the quantity 

Pfc(x,CT)= 0 

Observation 7 (Fundamental Property). Let (v,ct) be a simplex, and let r 
be a permutation in for some i € {i, . . . ,d}. Then: 

— Cr((v, = Ci-(v, cr) when r ^ 77* . 

- c^((v,cr)^'"W) = c^(v*^‘^W,cr) when r G Pf. 

Proof. Working in the coordinate system on which is griddy, i.e. ya{i) = 
and j/cr(i) = a^cr(i) ~ a;CT(i_i) for i > 2, it is easy to see that (v,<t) and (v, 
are both included in a same elementary box when r ^ Ilf, and are included in 
two different consecutive boxes in direction a{i) when r G Ilf. □ 
Observing that any simplex (v, r) S 5'^(x, cr) satisfies r G we obtain, 

for every k\ 

Pfe+i(x, cr) = P/c(x,(t) (3) 

Theorem 4 (Main Result). A cone $(x,\sigma)^{\triangleleft}$ occurs in the canonical
decomposition of a timed polyhedron P iff x is a vertex of P satisfying condition

$$\bigoplus_{(y,\tau) \in S^d(x,\sigma)} c(y,\tau) = 1 \qquad (4)$$



Proof. From 10) we have that Pd{^, cr) corresponds to 0 Ccr(y) ct), and hence 

yGAf„(x) 

it indicates the extremity of (x,(t). 

It remains to prove that an extreme point is a vertex by induction over d. 
Case d = 1 is immediate by a systematic inspection. Now, for d > 1, S'(x,ct) 
is the union of S~^ = S"^“^(x,cr) and S~ = (S"^“^(x, . Observing that 
these two sets are separated by hyper-plane H of equation Xo-(i) = Xcr(i) and 
that the sum of c on S~^ and S~ must differ, x must belong to a facet included 
in H. Interchanging, if necessary, 5'“'' and S~ , assume that sum of c on S~^ is 
I (and 0 on S~). Pushing away coordinate Xcr(i), the sum of c on S'+ reduces 
to the extremity condition in dimension d — I. The induction hypothesis says 
that X must belong to some edge linearly independent of H . As a consequence 
X must be a vertex. □ 



4 Algorithms on Timed Polyhedra 

In this section we describe operations on timed polyhedra using the extreme
vertices representation, assuming the dimension d is fixed, and using $n_P$ to denote
the number of $(x,\sigma)$ pairs in the canonical decomposition of P, stored
in some fixed order.

Theorem 5 (Boolean operations). Complementation, symmetric difference
and union (or intersection) of timed polyhedra A and B can be computed in time
$O(1)$, $O(n_A + n_B)$ and $O(n_A n_B)$ respectively.

Proof. Exactly as in Theorem 0 □ 
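
To illustrate why a linear bound for symmetric difference is plausible: when a polyhedron is written as a XOR of cones, a cone appearing in both operands cancels, so the representation of the symmetric difference is the set-theoretic symmetric difference of the two sets of pairs. The sketch below merges two lists kept in the same fixed order in a single pass. The encoding of pairs as nested tuples and the name `xor_representation` are assumptions for illustration, not the authors' data structure.

```python
def xor_representation(a, b):
    """Merge two sorted lists of (vertex, permutation) pairs, dropping common pairs.

    Runs in time proportional to len(a) + len(b).
    """
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # The same cone occurs in both operands and cancels under XOR.
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

Python's built-in tuple ordering plays the role of the fixed (lexicographic) order on pairs here.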

The two other operations needed for the verification of timed automata are 
the projection of a timed polyhedron on Xi = 0, which corresponds to resetting 
a clock variable to zero, and the forward time cone which corresponds to the 
effect of time passage which increases all clock variables uniformly. As in griddy 
polyhedra we will use the slicing operation for the computation. 

Definition 10 (Slice, Projection and Forward Time Cone). Let P be a 

timed polyhedron, z an integer in [0,m) and a a permutation. 
— The (i, z)-$\sigma$-slice of P is the d-dimensional polyhedron 

{^:Xi—z} 

— The i-projection of P is the {d — 1)- dimensional polyhedron 

$$\pi_{\downarrow i}(P) = \{(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d) : \exists z\ (x_1, \ldots, x_{i-1}, z, x_{i+1}, \ldots, x_d) \in P\}.$$

— The forward time cone of P is the d-dimensional polyhedron 

= {x : > 0 X - tl e P}. 

These operations are illustrated in Figure 7.

Fig. 7. A timed polyhedron P (a), some of its slices and projections (b) and its forward
time cone (c).



Theorem 6 (Computation of Slice, Projection and Time Cone). Given 
a timed polyhedron P, 

— The computation of $J^{\sigma}_{i,z}(P)$ can be done in time $O(n_P)$.

— The computation o/7r^i(P) can be done is time 0{n^). 

— The computation of P^ can be done is time 0{nff). 

Sketch of proof: The definition of $J^{\sigma}_{i,z}(P)$ in terms of vertex-permutation pairs
is $J^{\sigma}_{i,z}(P) = P \cap \{(x,\sigma) : x_i = z\}$. Hence the result follows immediately from the
identity $(A \oplus B) \cap C = (A \cap C) \oplus (B \cap C)$, which reduces the problem to $n_P$
intersections of cones with a hyperplane. For every i

U 

z^[0,m] <7^n 

hence, since both projection and time cone distribute over union they can be 
written as 

MP) = U U MJUP)) 

z^[0,m] (tGU 
and 

z£ [0,m] (T^IJ 

Projection of a slice is trivial, and the time cone of a slice is obtained by replacing
every elementary simplex by a forward cone. □



5 Past, and Future Work 

In this paper, we introduced a new canonical representation scheme for timed 
polyhedra as well as algorithms for performing the operations required for reach- 
ability analysis of timed automata. To the best of our knowledge these results 
are original. The representation of polyhedra is a well-studied problem, but most 
of the computational geometry and solid modeling literature is concerned only 
with low dimensional polyhedra (motivated by computer graphics) or with convex
polyhedra, for which a very nice theory exists. No such theory exists for
non-convex polyhedra (see, for example, and the references therein). 

The closest work to ours was that of [lA At)7f A At)8j . which we strengthened 
and generalized to arbitrary dimension in [IBM and extended from orthogo- 
nal to timed polyhedra in the present paper. The fact that non-convex polyhedra 
of arbitrary dimension can be represented using a 0-formula is not new (see for 
example a recent result in ) but so far only for griddy and timed polyhedra 
a natural canonical form has been found. 

We intend to implement this representation scheme and its corresponding 
algorithms, as we did for griddy polyhedra, and to see how the performance 
compares with other existing methods. Although the reachability problem for 
timed automata is intractable, practically, the manipulated polyhedra might 
turn out to have few vertices. In addition to the potential practical value we 
believe that timed polyhedra are interesting mathematical objects whose study 
leads to a nice theory. 
Abstract. A family of permutations $\mathcal{F} \subseteq S_n$ (the symmetric group) is
called min-wise independent if for any set $X \subseteq [n]$ and any $x \in X$, when
a permutation $\pi$ is chosen at random in $\mathcal{F}$ we have

$$\Pr(\min\{\pi(X)\} = \pi(x)) = \frac{1}{|X|}.$$

In other words we require that all the elements of any fixed set X have an
equal chance to become the minimum element of the image of X under $\pi$.
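
For very small ground sets the defining condition can be checked exhaustively. The sketch below is only an illustration of the definition, not a method from the talk; the function name `is_minwise_independent` and the encoding of a permutation as a tuple of images over {0, ..., n-1} are assumptions.

```python
from fractions import Fraction
from itertools import combinations, permutations


def is_minwise_independent(family, n):
    """Check the min-wise independence condition by brute force.

    family is a list of permutations of range(n), each a tuple pi with
    pi[x] the image of x. Every permutation is taken to be equally likely.
    """
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            for x in subset:
                # Count permutations under which x attains the minimum image.
                hits = sum(
                    1 for pi in family
                    if min(pi[y] for y in subset) == pi[x]
                )
                if Fraction(hits, len(family)) != Fraction(1, size):
                    return False
    return True


full_group = list(permutations(range(3)))
cyclic_shifts = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
```

The full symmetric group passes the check, while the three cyclic shifts do not: under them, element 0 is the minimum of the set {0, 1} with probability 2/3 rather than 1/2.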

The rigorous study of such families was instigated by the fact that such 
a family (under some relaxations) is essential to the algorithm used by 
the AltaVista Web indexing software to detect and filter near-duplicate 
documents. The insights gained from theoretical investigations led to 
practical changes, which in turn inspired new mathematical inquiries 
and results. 

This talk will review the current research in this area and will trace the 
interplay of theory and practice that motivated it. 
Abstract. This paper initiates the study of testing properties of directed
graphs. In particular, the paper considers the most basic property
of directed graphs: acyclicity. Because the choice of representation affects
the choice of algorithm, the two main representations of graphs are
studied. For the adjacency matrix representation, most appropriate for
dense graphs, a testing algorithm is developed that requires query and
time complexity of $\tilde{O}(1/\epsilon^2)$, where $\epsilon$ is a distance parameter independent
of the size of the graph. The algorithm, which can probe the adjacency
matrix of the graph, accepts every graph that is acyclic, and rejects, with
probability at least 2/3, every graph whose adjacency matrix should be
modified in at least an $\epsilon$ fraction of its entries so that it becomes acyclic. For
the incidence list representation, most appropriate for sparse graphs, an
$\Omega(|V|^{1/3})$ lower bound is proved on the number of queries and the time
required for testing, where V is the set of vertices in the graph. These
results stand in contrast to what is known about testing acyclicity in
undirected graphs.



1 Introduction 



The Problem. Deciding whether a graph is acyclic is one of the basic algorith- 
mic questions on directed graphs. It is well known that this problem can be solved 
by depth first search in time linear in the size of the graph. A natural generaliza- 
tion of this problem is asking how close to acyclic is a given graph. That is, what 
is the minimum number of edges (or vertices) that should be removed from the 
graph so that there are no remaining directed cycles. This problem is known as 
the minimum feedback arc ( or vertex) set problem. Unfortunately, this problem is 
NP-hard |2Z! and even APX-hard [2S! ■ Consequently researchers have developed 
approximation algorithms in various settings (including studying the
complementary problem of the maximum acyclic subgraph).

Testing Graph Properties. The field of Testing Graph Properties suggests
an alternate framework with which to study the above problem, and this is
the approach that we take in this paper. A property tester determines whether a
graph G = (V, E) has a given property or is far from having the property. More
formally, Testing Graph Properties is the study of the following family of tasks.
Let P be a predetermined graph property (such as acyclicity, connectivity, or
3-colorability). A testing algorithm for property P is given access to a graph G
so it can query the incidence relationships between vertices. If G has property
P then the algorithm should accept with probability at least 2/3. If many edge
modifications should be performed so that G has the property, then the algorithm
should reject with probability at least 2/3. The success probability of the
algorithm can clearly be amplified by repetitions to be arbitrarily close to 1.
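
The amplification step can be made concrete with a small calculation. One standard way to amplify (an assumption here, since the text does not spell out the scheme) is to run the tester an odd number k of times independently and output the majority answer; the sketch below computes the exact probability that the majority is wrong when each run errs independently with probability `p_err`. The function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(p_err, k):
    """Probability that a majority vote over k independent runs is wrong.

    Each run is wrong independently with probability p_err; k must be odd
    so that the majority is always well defined.
    """
    if k % 2 == 0:
        raise ValueError("use an odd number of runs")
    # The vote fails when more than half of the runs are wrong.
    return sum(
        comb(k, j) * p_err**j * (1 - p_err) ** (k - j)
        for j in range(k // 2 + 1, k + 1)
    )
```

With a per-run error of 1/3, three runs already reduce the error to 7/27, and the error keeps shrinking as k grows.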

We thus relax the task of deciding exactly whether the graph has the property, 
but expect of the algorithm to perform its task by observing as few vertices and 
edges in the graph as possible. Specifically, we are only willing to spend time that 
is sub-linear in or even independent of the size of the graph. Thus, in contrast 
to the standard notion of graph algorithms, property testing algorithms are not 
provided the whole graph and required to run in time polynomial in the size of 
the graph. Rather, they are provided access to the graph and are expected to 
run in sub-linear time. 

More concretely, in this paper we study the question of whether a graph G is 
acyclic or far from acyclic. If the graph is far from acyclic (that is, many edges 
should be removed so that no cycle remains), then the tester should reject; if the 
graph actually is acyclic, the tester should accept; if the graph is nearly acyclic, 
then the tester may answer either way. Thus, we excuse the tester from answering 
the most difficult instances correctly, but we require the tester to execute much 
more quickly than any exact decision algorithm. 

Alternate Notion of Approximation. In view of the above, property test- 
ing suggests an alternative notion of approximation that is related to the notion 
of dual approximation p4l25) . An approximation algorithm is a mechanism that 
trades accuracy for speed. Given an optimization problem that associates costs 
with solutions, the more standard notion of approximation is to find a solution 
that is close to the cost of the optimal solution. By “close,” we mean that the 
value found is within some multiplicative factor of the optimal cost. 

A property tester also trades accuracy for speed, but may use a different 
notion of distance. Specifically, distance is measured in terms of the number of 
edge insertions and deletions necessary to obtain a particular property (which, 
in particular, may be having a solution with a given cost) . 

The following example illustrates the two notions of distance. A graph G 
might be nearly 3-colorable in the sense that there is a 3-colorable graph G' at 
small edit distance to G, but far from 3-colorable in the sense that many colors 
are required to color G. Alternatively, a graph G might be nearly 3-colorable 
in the sense that it is 4-colorable, but far from 3-colorable in the sense that no 
graphs having small edit distance to G are 3-colorable. Both notions are natural 
and the preferred choice depends on the context. In some cases the two notions 
coincide (e.g., Max-Gut ini)j3 



^ In the case of acyclicity, the notion of distance in the context of property testing 
and the cost approximated in the minimum feedback arc set problem are in fact the 
Applications. A graph-property tester may be employed in several contexts. 
(1) A fast property tester can be used to speed up the slow exact decision 
procedure as follows. Before running the slow decision procedure, run the tester. 
If the fast inexact tester rejects, then we know with high confidence that the 
property does not hold and it is unnecessary to run the slow tester. In fact, it is 
often the case that when the testing algorithm rejects, it provides a witness that 
the graph does not have the property (in our case, a cycle). If the fast inexact 
tester accepts, then a slow exact decision procedure will determine whether the 
property is close to holding or actually holds. (2) There are circumstances in 
which knowing that a property nearly holds is good enough and consequently 
exact decision is unnecessary. (3) It may even be NP-hard to answer the question 
exactly, and so some form of approximation is inevitable. 

Impact of Graph Representation. We now define precisely the notion of 
distance and how the tester actually probes the graph. In fact, there are two 
traditional representations for graphs, adjacency matrices and incidence lists. 
The choice of representation strongly affects these issues, as well as the applicable 
algorithmic techniques. We summarize the properties of each representation here. 

• Adjacency-Matrix Model. Goldreich, Goldwasser, and Ron [20] consider the
adjacency-matrix representation of graphs, where the testing algorithm is
allowed to probe into the matrix. That is, the algorithm can query whether
there is an edge between any two vertices of its choice. In the undirected
case the matrix is symmetric, whereas in the directed case it may not be. In
this representation the distance between graphs is the fraction of entries in
the adjacency matrix on which the two graphs differ. By this definition, for
a given distance parameter $\epsilon$, the algorithm should reject every graph that
requires more than $\epsilon \cdot |V|^2$ edge modifications in order to acquire the tested
property. This representation is most appropriate for dense graphs, and the
results for testing in this model are most meaningful for such graphs.

• (Bounded-Length) Incidence-Lists Model. Goldreich and Ron [21] consider
the incidence-lists representation of graphs. In this model, graphs are
represented by lists of length d, where d is a bound on the degree of the graph. Here
the testing algorithm can query, for every vertex v and index $i \in \{1, \ldots, d\}$,
which is the i-th neighbor of v. If no such neighbor exists then the answer
is '0'. In the case of directed graphs each such list corresponds to the
outgoing edges from a vertex.² Analogously to the adjacency matrix model, the
distance between graphs is defined to be the fraction of entries on which
the graphs differ according to this representation. Since the total number of



same. However, the two problems do differ mainly because the former is a “promise” 
problem and so nothing is required of the algorithm in case the graph is close to 
acyclic. 

^ Actually, the lower bound we prove for this model holds also when the algorithm 
can query about the incoming edges to each vertex (where the number of incoming 
edges is bounded as well) . We note that allowing to query about incoming edges can 
make testing strictly easier. 
incidence-list entries is $d \cdot |V|$, a graph should be rejected if the number of
edge modifications required in order to obtain the property is greater than
$\epsilon \cdot d|V|$.³

Testing Directed Graphs. This paper studies property testing for directed 
graphs. Typically, a given problem on a directed graph is more difficult than the 
same problem on an undirected graph. In particular, testing acyclicity of undi- 
rected graphs in the adjacency-matrix representation is straightforward: Assume 
e ^ (or otherwise it is possible to make an exact decision in time polynomial 
in $1/\epsilon$ by looking at the whole graph). The basic observation is that any graph
having at most $\epsilon|V|^2$ edges is nearly acyclic (because it is $\epsilon$-close to the empty
graph), while only very sparse graphs (having at most $|V| - 1 < \epsilon|V|^2$ edges)
may be acyclic. Hence the algorithm can estimate the number of edges in the 
graph by sampling, and accept or reject based on this estimate. Testing acyclic- 
ity of undirected graphs in the incidence-list (bounded-degree) representation, 
is more interesting and is studied in m- However, this result does not extend 
to testing directed graphs. 

Our Results. We first consider the problem of testing acyclicity in the
adjacency matrix representation. We describe a tester whose query complexity and
running time are independent of the size of the graph and polynomial in the
given distance parameter $\epsilon$. Specifically, the query complexity and running time
are both $\tilde{O}(1/\epsilon^2)$. As mentioned above, the algorithm works by randomly
and uniformly selecting a set of $\tilde{O}(1/\epsilon)$ vertices, and verifying whether the small
subgraph induced by these vertices is acyclic. Thus, an acyclic graph is always
accepted, and for all rejected graphs, the algorithm provides a "witness" that
the graph is not acyclic in the form of a short cycle. A key (combinatorial) lemma
used in proving the correctness of the algorithm shows that a graph that is far
from acyclic contains a relatively large subset of vertices for which every vertex
in the subset has many outgoing edges extending to other vertices in the subset.
We then show that a sample of vertices from within this subset likely induces a
subgraph that contains a cycle.

We next turn to the problem of acyclicity testing in the incidence-lists
representation. We demonstrate that the problem is significantly harder in this
setting. Specifically, we show an $\Omega(|V|^{1/3})$ lower bound on the number of queries
required for testing in this setting. To prove the bound we define two classes of
directed graphs: one containing only acyclic graphs and one containing mostly
graphs that are far from acyclic. We show that $\Omega(|V|^{1/3})$ queries are required in
order to determine from which class a randomly selected graph was chosen.

It appears that the techniques used in testing undirected graphs in the 
incidence-lists representation, cannot be applied directly to obtain an efficient 
acyclicity testing algorithm for directed graphs. Consider a graph that contains 

® A variant of the above model allows the incidence lists to be of variant lengths m- 
In such a case, the distance is defined with respect to the total number of edges in 
the graph. 
a relatively large subgraph that is far from acyclic, but such that many edges
connect this subgraph to acyclic regions of the graph. By our lower bound,
any testing algorithm should perform many queries concerning edges within this
subgraph (to distinguish it from the case in which the subgraph is acyclic).
However, both exhaustive local searches and random walks will "carry the algorithm
away" to the acyclic regions of the graph. It would be interesting to develop an
acyclicity tester that uses $O(|V|^{1-\alpha})$ queries, for any $\alpha > 0$.

Testing Other Properties of Directed Graphs. As noted in [20, Subsection
10.1.2], some of the properties studied in that paper (in the adjacency
matrix model) have analogies in directed graphs. Furthermore, the algorithms 
for testing these properties can be extended to directed graphs. In particular, 
these properties are defined by partitions of the vertices in the graph with cer- 
tain constraints on the sizes of the sets in the partition as well as on the density 
of edges between these sets. The techniques of ra (for testing properties of undi- 
rected graphs in the adjacency-matrix representation), can also be extended to 
testing properties of directed graphs (Private communications with Noga Alon). 

Another basic property of directed graphs, is (strong) connectivity. Namely, 
a directed graph is strongly connected if there is a directed path in the graph 
from any vertex to any other vertex. Testing this property is most meaningful 
in the incidence-lists model, as every graph can be made strongly connected by 
adding at most 27V directed edges. The undirected version of this problem is 
studied in ED, where an algorithm having query and times complexities 0(1 /e) 
is presented and analyzed. As we show in the long version of this paper E]j this 
algorithm can be extended to the directed case if the algorithm can also perform 
queries about the incoming edges to each vertex. Otherwise (that is, if the algorithm
can only perform queries about outgoing edges), a lower bound of $\sqrt{N}$ on the number
of queries can be obtained.

Related Work. Property testing of functions was first explicitly defined in |32| 
and extended in pm. Testing algebraic properties (e.g., linearity or being a poly- 
nomial of low-degree) plays an important role in the settings of Program Testing 
(e.g., [ I Of, 321,3 1 j ) and Probabilistically-Checkable Proof systems (e.g., |ZEC3E0)- 
As mentioned previously, the study of testing graph properties was initiated
in [20], where, in particular, the adjacency-matrix model was considered. Some
of the properties studied in that work are bipartiteness, k-colorability, having a
clique of a certain size and more. In [21], testing properties of graphs represented
by their incidence lists was considered. Some of the properties studied in that
work are k-connectivity and acyclicity.

Ergun et. al. urn give a poly(l/e)-time algorithm for testing whether a rela- 
tion is a total order. This can viewed as a special case of testing acyclicity in the 
adjacency-matrix model, where it is assumed that a directed edge exists between 
every two vertices. 

Other papers concerning testing of graph properties and other combinatorial 
properties include |22ll2l28ll9llH . Recently, Alon et. al. P presented a general 
family of graph properties that can be tested using a sample that is independent 
of the size of the graph (though the dependence on the distance parameter $\epsilon$ is
somewhat high). In [2] it is shown that all properties defined by regular languages
can be tested using a sample of size almost linear in the distance parameter.

As mentioned previously, the related minimum feedback set problem is APX- 
Hard and its complementary, maximum acyclic subgraph is APX-Complete 
The former can be approximated to within a factor of 0(log |V| loglog |V|) 
m, and the latter to within 2/(1 + l7(l/-\/A)) where A is the maximum degree 
PE3|. Variants of these problems are studied in the following papers 
Perhaps the result most closely related to our work is that of Frieze and Kan- 
nan HS|. They show how to approximate the size of the maximum acyclic sub- 
graph to within an additive factor of e|Vp, in time exponential in 1/e. In com- 
parison to their result, we solve a more restricted problem in time polynomial in 
1/e as opposed to exponential. In addition, since our analysis is tailored to the 
particular problem (as opposed to theirs which follows from a general paradigm 
that gives rise a family of approximation algorithms), it may give more insight 
into the problem in question. 

With the current trend of increasing memory and storage sizes, the problem of 
examining large structures in sublinear time has been studied in other contexts. 
For example. Gibbons and Matias jitillY) develop a variety of data structures 
that glean information from large databases so they can be examined in sublinear 
time. 

2 Definitions 

Let G = (V, E) be a directed graph, where |V| = N, and $E \subseteq V \times V$ consists
of ordered pairs of vertices. For a given set of vertices $U \subseteq V$, let G(U) denote
the subgraph of G induced by U, and for any two sets of vertices $U_1$ and $U_2$, let
$E(U_1, U_2)$ denote the set of edges going from vertices in $U_1$ to vertices in $U_2$.
That is, $E(U_1, U_2) \stackrel{\mathrm{def}}{=} \{(u_1, u_2) \in E : u_1 \in U_1, u_2 \in U_2\}$.

We say that a graph G is acyclic if it contains no directed cycles. In other
words, G is acyclic if and only if there exists a (one-to-one) ordering function
$\phi : V \to \{1, \ldots, N\}$, such that for every $(v, u) \in E$, $\phi(v) < \phi(u)$. We say that an
edge $(v, u) \in E$ is a violating edge with respect to an ordering $\phi$ if $\phi(v) > \phi(u)$.

We consider two representations of (directed) graphs. In the adjacency-matrix
representation, a graph G is represented by a 0/1 valued $N \times N$ matrix $M_G$,
where for every pair of vertices $u, v \in V$, $M_G[u, v] = 1$ if and only if $(u, v) \in E$.
This representation is more appropriate for dense graphs than sparse graphs,
because with sparse graphs the representation wastes a large amount of space. In
the incidence-lists representation, a graph G is represented by an $N \times d$ matrix
$L_G$ (which can be viewed as N lists), where d is a bound on the outdegree of
each vertex in G. For $v \in V$ and $i \in [d]$, $L_G[v, i] = u$ if and only if the i-th edge
going out of v is directed to u. If such an edge does not exist then the value of
the entry is '0'.

For any $0 < \epsilon < 1$, a graph G in either of the two representations is said
to be $\epsilon$-close to acyclic if at most an $\epsilon$-fraction of entries in G's representation
need to be modified to make G acyclic. If more than an $\epsilon$ fraction of entries
must be modified, then it is $\epsilon$-far from acyclic. Because the adjacency-matrix
representation has size $N^2$, this means that a graph G in the adjacency-matrix
representation is $\epsilon$-close to being acyclic if removing at most $\epsilon \cdot N^2$ edges
suffices to make G acyclic. Because the incidence-lists representation has size $d \cdot N$, the
number of edges that should be removed in this representation is at most $\epsilon \cdot dN$.
Note that a graph is $\epsilon$-close to acyclic if and only if there exists an order function
$\phi(\cdot)$, with respect to which there are at most $\epsilon N^2$ (similarly, $\epsilon \cdot dN$) violating
edges. We say in this case that $\phi(\cdot)$ is $\epsilon$-good.
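
The characterization of closeness through orderings is easy to state in code. The following sketch counts the violating edges of a given ordering and checks whether the ordering is good for a given distance parameter in the adjacency-matrix model. The names `violating_edges` and `is_eps_good`, and the representation of the ordering as a dictionary from vertices to positions, are assumptions made for illustration.

```python
def violating_edges(edges, order):
    """Return the edges (v, u) that point backwards under the ordering.

    edges is an iterable of ordered pairs; order maps each vertex to its
    position in a one-to-one ordering.
    """
    return [(v, u) for (v, u) in edges if order[v] > order[u]]


def is_eps_good(edges, order, eps, n):
    """Adjacency-matrix model: the ordering has at most eps * n^2 violating edges."""
    return len(violating_edges(edges, order)) <= eps * n * n


# A directed triangle: any ordering leaves at least one violating edge.
triangle = [(0, 1), (1, 2), (2, 0)]
identity_order = {0: 1, 1: 2, 2: 3}
```

For the triangle with the identity ordering, the only violating edge is (2, 0); removing it leaves an acyclic graph.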

A testing algorithm for acyclicity is given a distance parameter $\epsilon$, and oracle
access to an unknown graph G. In the adjacency-matrix representation this
means that the algorithm can query for any two vertices u and v whether $(u, v) \in E$.
In the incidence-lists representation this means that the algorithm can query,
for any vertex v and index $i \in [d]$, what vertex the i-th edge going out
of v points to. If the graph G is acyclic then the algorithm should accept with
probability at least 2/3, and if it is $\epsilon$-far from acyclic then the algorithm should
reject with probability at least 2/3.

3 Testing Acyclicity in the Adjacency-Matrix 
Representation 

We next give our algorithm for testing acyclicity when the graph is represented by
its adjacency matrix. Similarly to several previous testing algorithms in the
(undirected) adjacency-matrix model, the algorithm is the "natural" one. Namely,
it selects a random subgraph of G (having only $\tilde{O}(1/\epsilon)$ vertices), and checks
whether this subgraph is acyclic (in which case it accepts) or not (in which case
it rejects). Observe that the sample size is independent of the size of G.

Acyclicity Testing Algorithm 

1. Uniformly and independently select a set of $O(\log(1/\epsilon)/\epsilon)$ vertices and
denote the set by U.

2. For every pair of vertices $v_1, v_2 \in U$, query whether either $(v_1, v_2) \in E$ or
$(v_2, v_1) \in E$, thus obtaining the subgraph G(U) induced by U.

3. If G(U) contains a cycle, then reject, otherwise accept.

Theorem 1 The algorithm described above is a testing algorithm for acyclicity
having query and time complexity Õ(1/ε^2). Furthermore, if the graph G is acyclic
it is always accepted, and whenever the algorithm rejects a graph it provides a
certificate of the graph’s cyclicity (in the form of a short cycle).

The bound on the query and time complexity of the algorithm follows directly
from the description of the algorithm. In particular, there are O(log^2(1/ε)/ε^2)
pairs of vertices in U, which limits the number of queries made as well as the
number of edges in G(U). To verify whether G(U) is acyclic or not, a
Breadth-First-Search (BFS) can be performed starting from any vertex in G(U). The
time complexity of this search is bounded by the number of edges in G(U), as
desired. The second statement in the theorem is immediate as well. It remains
to be shown that every graph that is ε-far from being acyclic is rejected with
probability at least 2/3.

Proof Idea. We prove Theorem 1 using two lemmas. Lemma 2 shows that if a
graph G is far from acyclic, then G contains a relatively large set W such that
each vertex in W has many outgoing edges to other vertices in W.¹ Lemma 3
shows that if we uniformly select a sufficient number of vertices from W, then
with probability at least 9/10 the underlying graph induced by these vertices
contains a cycle. To prove Theorem 1 we show that with sufficiently high
probability, a large enough sample of vertices in G contains enough vertices in W to
find a cycle with the desired probability.

Definitions. To formalize the above ideas, we use the following definitions.
For any vertex v ∈ V, let O(v) = {u : (v, u) ∈ E} be the set of v’s outgoing
neighbors. Given a set W ⊆ V, we say that v has low outdegree with respect to W if
|O(v) ∩ W| ≤ (ε/2)·N; otherwise it has high outdegree with respect to W.

Lemma 2 If G is ε-far from acyclic, then there exists a set W ⊆ V, such that
|W| ≥ √(ε/2)·N and every v ∈ W has high outdegree with respect to W.

Lemma 3 Let W ⊆ V be a set of vertices such that for every v ∈ W,
|O(v) ∩ W| ≥ ε'|W| for some ε' > 0. Suppose we uniformly and independently select
Ω(log(1/ε')/ε') vertices in W. Then with probability at least 9/10 (over this
selection) the subgraph induced by these vertices contains a cycle.

We prove the two lemmas momentarily, but first we show how Theorem 1
follows from the two lemmas.

Proof of Theorem 1. If G is acyclic, then clearly it always passes the
test. Thus, consider the case in which G is ε-far from acyclic. By Lemma 2
there exists a set W ⊆ V, such that |W| ≥ √(ε/2)·N, and every v ∈ W has high
outdegree with respect to W. Let α = |W|/N be the fraction of graph vertices
that belong to W, so that α ≥ √(ε/2). By applying a (multiplicative) Chernoff
bound we have that for every integer m > 12, with probability at least 9/10, a
uniformly and independently selected sample of 2m/α vertices contains at least
m (not necessarily distinct) vertices in W (where these vertices are uniformly
distributed in W). Assume this is in fact the case (where we account for the
probability of error and set m below).

Let ε' = ε/(2α), so that by definition of α, for every v ∈ W, |O(v) ∩ W| ≥ ε'|W|.
By setting the quantity m to be Θ(log(1/ε')/ε'), and applying Lemma 3, we
obtain that conditioned on there being m elements in the sample that belong to W,
a cycle is observed with probability at least 9/10. Adding the two error
probabilities, and noting that the total size of the sample is O(m/α) = O(log(1/ε)/ε),
the theorem follows. □

¹ In fact, as we showed in an earlier version of this paper, there exists a relatively
large set W such that every vertex in W also has many incoming edges from other
vertices in W. However, we no longer need this stronger property.

Now that we have demonstrated how Theorem 1 follows from Lemmas 2 and 3,
we prove the lemmas themselves.

Proof of Lemma 2. We prove the contrapositive of the statement in the
lemma: If such a set W does not exist, then the graph is ε-close to being acyclic.
In other words, we show that if for every subset Z ⊆ V having size at least √(ε/2)·N,
there exists at least one vertex in Z having low outdegree with respect to Z,
then the following holds. There exists an order φ : V → [N], and a set of edges
T of size at most εN^2, such that the edges of T are the only violating edges with
respect to φ.

We define φ and construct T in steps. At each step we select a vertex v
for which φ is not yet determined, and set the value of φ(v). We maintain an
index ℓ (last), where initially ℓ = N. At the start of a given step, let Z ⊆ V
denote the set of vertices for which φ is yet undefined (where initially, Z = V).
As long as |Z| ≥ √(ε/2)·N we do the following. Consider any vertex v that has
low outdegree with respect to Z (where the existence of such a vertex is ensured
by our (counter) assumption). Then we set φ(v) = ℓ, decrease ℓ by 1, and let
T ← T ∪ {(v, u) ∈ E : u ∈ Z}. Hence, at each step, the size of T increases by at
most (ε/2)·N.

Finally, when |Z| < √(ε/2)·N, so that the vertices in Z may all have high
outdegree with respect to Z, we order the vertices in Z arbitrarily between 1 and ℓ,
and add to T all (at most |Z|^2 < (ε/2)·N^2) edges between vertices in Z. Thus, the
total number of edges in T is bounded by εN^2, as desired.

It remains to show that there are no other violating edges with respect to φ.
Namely, for every (v, u) ∈ E \ T, it holds that φ(v) < φ(u). Consider any such
edge (v, u) ∈ E \ T. We claim that necessarily the value of φ was first defined
for u, implying that in fact φ(v) < φ(u) (since the value ℓ given by φ decreases
as the above process progresses). This must be the case since otherwise, if the
value of φ was first defined for v then the edge (v, u) would have been added to
T, contradicting our assumption that (v, u) ∈ E \ T. □
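The construction in the proof is effectively a greedy procedure, and a small sketch may help in following it. The code below is an illustration under stated assumptions: the graph is given explicitly as a dictionary `out` mapping each vertex to its set of out-neighbors (the proof itself is existential and makes no queries), and the function name and return convention are hypothetical. If at some point no low-outdegree vertex exists while Z is still large, the current Z is returned as the set W of Lemma 2.

```python
import math


def order_or_witness(out, eps):
    """Follow the proof of Lemma 2 on an explicitly given digraph.

    `out[v]` is the set of out-neighbours of v (assumed representation).
    Returns ("order", phi, T), where T contains every violating edge of
    the order phi, or ("witness", W), where every vertex of W has high
    outdegree with respect to W.
    """
    n = len(out)
    low = eps * n / 2                      # low-outdegree threshold (eps/2)N
    z = set(out)                           # vertices with phi still undefined
    phi, t = {}, set()
    last = n                               # the index "ell" of the proof
    while len(z) >= math.sqrt(eps / 2) * n:
        v = next((x for x in z if len(out[x] & z) <= low), None)
        if v is None:
            return ("witness", z)          # Z plays the role of W
        phi[v] = last
        last -= 1
        t |= {(v, u) for u in out[v] & z}  # at most (eps/2)N new edges
        z.remove(v)
    # Z is now small: order it arbitrarily and give up on its internal edges.
    rest = list(z)
    for rank, v in enumerate(rest, start=1):
        phi[v] = rank
    t |= {(v, u) for v in rest for u in out[v] & z}
    return ("order", phi, t)
```

In the "order" case every edge outside T goes from a smaller to a larger value of phi, which is exactly the property verified in the last paragraph of the proof.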

Proof of Lemma 3. Let m = c·ln(1/ε')/ε' + 1 be the number of vertices selected
(uniformly and independently) from W, where c is a constant that is set below.
We shall show that with probability at least 9/10 over the choice of such a sample
U, for every vertex v ∈ U, there is another vertex u ∈ U such that (v, u) ∈ E.
This implies that the subgraph induced by U has no sink vertex, and hence
contains a cycle.

Let the m vertices selected in the sample U be denoted v_1, ..., v_m. For each
index 1 ≤ i ≤ m, let E_i be the event that there exists a vertex v_j such that
(v_i, v_j) ∈ E. In other words, E_i is the (desirable) event that v_i has non-zero
outdegree in the subgraph induced by U. We are interested in upper bounding
the probability that for some i the event E_i does not hold, that is,
Pr[∪_{i=1}^{m} ¬E_i].
Since the vertices in the sample are chosen uniformly and independently, and
every vertex in W has outdegree at least ε'·|W|, for each fixed i,

Pr[¬E_i] ≤ (1 − ε')^{m−1} < exp(−(m − 1)ε') = exp(−c·ln(1/ε')) = (ε')^c    (1)



By applying a probability union bound,

Pr[∪_{i=1}^{m} ¬E_i] ≤ Σ_{i=1}^{m} Pr[¬E_i] < (c·ln(1/ε')/ε' + 1)·(ε')^c    (2)



Setting c to be a sufficiently large constant (say, c > 10), for any ε' < 1/2 the
above probability is at most 1/10 as required. □
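The final claim is easy to check numerically. The snippet below simply evaluates the right-hand side of (2); the function name is hypothetical, and the value c = 10 is taken as a representative "sufficiently large" constant rather than the authors' exact choice.

```python
import math


def failure_bound(eps_prime, c=10):
    """Right-hand side of (2): m * (eps')^c with m = c*ln(1/eps')/eps' + 1."""
    m = c * math.log(1 / eps_prime) / eps_prime + 1
    return m * eps_prime ** c
```

For c = 10 the bound is increasing in ε' on (0, 1/2], so its largest value on that range is attained at ε' = 1/2, where it is roughly 0.0145, comfortably below 1/10.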



4 Testing Acyclicity in the Incidence-Lists Representation 



In this section we give a lower bound of Ω(N^{1/3}) for testing acyclicity in the
incidence-lists representation, when d and ε are constant. This lower bound
holds even when the algorithm may query about the incoming edges to each
vertex (where the indegree of each vertex is also at most d).

Theorem 4 Testing Acyclicity in the incidence-lists representation with dis- 
tance parameter e < ^, requires more than ^ queries. 

Due to space limitations, the full proof appears in the long version of this 
paper jSj, and here we just sketch the idea. To prove the bound we define two 
classes of (directed) graphs, 𝒢_1 and 𝒢_2, each over N vertices, with degree bound
d. All graphs in 𝒢_1 are acyclic, and we show that almost all graphs in 𝒢_2 are
e-far from acyclic (for e = ^). We then prove that no algorithm can distinguish 
between a graph chosen randomly in 𝒢_1 and a graph chosen randomly in 𝒢_2 in
less than a ■ queries, for 0 < a < The classes are defined as follows. 

• Each graph in consists of K = layers, Li,...,Lx, each having 

M = v0]-t;ic0g^ From each layer L/ there are d • |L/ = d - M edges going 

to layer L_{i+1}, where these edges are determined by d matchings between the
vertices in the two layers.

• Each graph in 𝒢_2 consists of two equal-size subsets of vertices, S_1 and S_2.
There are d·N/2 edges going from S_2 to S_1, and d·N/2 edges going from S_1 to
S_2. The two sets of edges are each defined by d matchings between S_1 and S_2.

In both cases, every edge has the same label at both its ends (determined by the 
matching) . 
Computing the Girth of a Planar Graph *

Abstract. The girth of a graph G is defined as the length of
a shortest cycle of G. We design an O(n^{5/4} log n) algorithm for finding
the girth of an undirected n-vertex planar graph, giving the first o(n^2)
algorithm for this problem. Our approach combines several techniques
such as graph separation, hammock decomposition, covering of a planar
graph with graphs of small tree-width, and dynamic shortest path
computation. We discuss extensions and generalizations of our result.



1 Introduction 

Given an (unweighted) graph G, the length of a path p in G is the number of
the edges in p. The girth of G (denoted by girth(G)) was defined by Harary [15]
as the length of a shortest cycle in G, or infinity if G has no cycle. The girth
is a basic combinatorial characteristic of graphs and its relations to other graph
properties have been extensively studied. In particular, Erdős [12], Lovász [19],
Bollobás [4], Cook [5], and others studied the relationship between the girth and
the chromatic number of a graph. Thomassen [22] and Mader [20] studied the
relationship between the girth and the existence of certain types of minors of the
graph. Other results relate the girth of the graph to the minimum or the average
degrees of its vertices, its diameter, its maximum genus, and its connectivity (see
[6]).

The first efficient algorithm for computing the girth of a graph was given by
Itai and Rodeh [16], who describe an O(nm) algorithm for computing the girth
of a general n-vertex m-edge graph G. They also design an O(n^2) algorithm
for computing the girth within an additive error of one. Finding shortest cycles
of even and odd lengths has been studied by Monien [21] and Vazirani and
Yannakakis [23]. There are numerous results on finding a cycle of a given length
in general or special graphs; see e.g. [1] and [2] for recent results and references.
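For orientation, an O(nm) bound for general graphs can be met by running a breadth-first search from every vertex. The sketch below is the textbook version of that idea, offered as a baseline and not as a transcription of the procedure in [16]; it assumes a simple undirected graph given as a dictionary `adj` of adjacency lists, which is a representation chosen for the example.

```python
from collections import deque


def girth_bfs(adj):
    """Girth of a simple undirected graph via one BFS per vertex.

    `adj[v]` lists the neighbours of v. Returns float("inf") for a forest.
    """
    best = float("inf")
    for s in adj:
        dist = {s: 0}
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # Non-tree edge: closes a walk through s that contains
                    # a cycle of length at most dist[u] + dist[w] + 1.
                    best = min(best, dist[u] + dist[w] + 1)
    return best
```

Every candidate value is at least the girth, and a search started at a vertex of a shortest cycle produces the girth exactly, so the minimum over all start vertices is correct.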

In the case of planar graphs Itai and Rodeh [16] give an O(n) algorithm for
finding a triangle in the graph, if one exists (and thus solve the girth problem
for planar graphs in case girth(G) ≤ 3 in O(n) time). Their results were
generalized by Eppstein [11], who developed an O(n) algorithm for finding the
girth of a planar graph G provided girth(G) = O(1). (His algorithm, however,
is superexponential with respect to girth(G).)

* This work was partially supported by the EPA grant R82-5207-01-0, EPSRC grant 
GR/M60750, and RTDF grant 98/99-0140. 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 821-831, 2000 
© Springer-Verlag Berlin Heidelberg 2000 
Note that, for an embedded planar graph G, the size of each face of G is
an upper bound on the girth of G. However, the shortest cycle of G need not
be a face of G.

In this paper we present an algorithm that finds the girth of an arbitrary
undirected n-vertex planar graph in O(n^{5/4} log n) time.

Our approach makes use of recently developed fast dynamic algorithms for 
computing shortest paths in planar graphs with small face-on-vertex cover [7].
If G has an appropriately large girth, then we show that the shortest cycle in 
G can be computed by combining separator based divide-and-conquer with the 
dynamic shortest path algorithm from [7]. If, on the other hand, the girth of G 
is very small, then we can decompose G into subgraphs of small diameter and 
such that any shortest cycle of G is contained also in one of the subgraphs of the 
decomposition. Therefore, we can search for a shortest cycle in each subgraph 
independently, which will be less expensive in terms of total computation time. 

The rest of the paper is organized as follows. In Section 2, we give some 
definitions and review basic facts related to graph separators. In Section 3 and 
Section 4, we describe algorithms efficient for graphs of large or small girth, 
respectively. In Section 5, we describe the algorithm for the case of general 
graphs. In the last section we discuss some related results and open problems. 



2 Preliminaries 

By G = (V, E) we denote in the rest of this paper an undirected connected graph
with n vertices. Given an edge e ∈ E and a set S ⊆ V, by G − e we denote the
graph (V, E \ {e}) and by G − S we denote the graph (V \ S, E \ (S × S)). By
deg(G) we denote the maximum degree of a vertex of G.

If any edge e of G has a non-negative cost cost(e) associated with it, then
the length of a path p of G is the sum of the costs of all edges of p. The distance
between two vertices x and y of G is the minimum length of a path joining x and
y. The single-source shortest path problem asks, given a source vertex s of G, to
compute the distances between s and all other vertices of G. If G is planar, then
the single-source shortest path problem for G can be solved in O(n) time [17].

The graph G is planar if G can be embedded in the plane so that no two
edges intersect except at a common endpoint. A planar graph of n vertices has
at most 3n − 3 = O(n) edges. A graph already embedded in the plane is a plane
graph.

A separator of G is a set of vertices whose removal leaves no connected
component of more than n/2 vertices. If G is planar, then G has a separator of
size O(√n) [18,9] and if G has genus g > 0, then G has a separator of O(√(gn))
vertices [8, 14]. In both cases, the corresponding separators can be found in O(n)
time.

If a graph G has non-negative weights associated with its vertices, then a 
weighted separator of G is a set of vertices whose removal leaves no connected 
component of weight greater than half the weight of G. 
For other standard definitions from graph theory see [6] or any other textbook
on graphs.



3 Finding shortest cycles in graphs of large girth 

In this section we assume that the girth of G is bounded from below by some
parameter γ > 0. We will use the dynamic shortest path algorithm from [7] that
runs faster for graphs of small face-on-vertex cover.

A hammock decomposition of an n-vertex graph G was defined by Freder- 
ickson in [13] as a decomposition of G into certain outerplanar digraphs called 
hammocks. Hammocks satisfy the following properties: 

(i) each hammock has at most four vertices, called attachment vertices, shared 
with the rest of the graph; 

(ii) the hammock decomposition spans all the edges of G, i.e., each edge belongs 
to exactly one hammock; and 

(iii) the number of hammocks produced is the minimum possible (within a 
constant factor) among all possible decompositions. 

Frederickson showed in [13] that hammock decompositions of a small cardi- 
nality can be computed fast for planar graphs if a good approximation of the 
minimum cardinality of a face-on-vertex cover is known. 

Theorem 1. [13] Let G be an n-vertex planar graph and let f(G) be the
minimum number of faces (over all embeddings of G in the plane) that cover all
vertices of G. Then G can be decomposed into O(f(G)) hammocks in O(n) time.

The results of Frederickson [13], Djidjev et al. [7], and others have shown 
that several shortest paths problems can be solved more efficiently for certain 
classes of graphs with hammock decompositions of a small cardinality. 

Our goal is to make use of the following shortest path algorithm based on 
the idea of a hammock decomposition. 

Theorem 2. [7] Let G be an n-vertex planar digraph with nonnegative edge
costs and assume that G has a hammock decomposition of cardinality q. There
exists an algorithm for the dynamic shortest path problem on G with the following
performance characteristics:

(i) preprocessing time and space O(n);

(ii) query time for computing the distance between any two vertices O(log n + q);

(iii) time for updating the data structure after any edge cost modification or
edge deletion O(log n).

The following algorithm for finding the girth is based on Theorem 1, Theo- 
rem 2, and the observation that the size of each face of a plane graph is not less 
than the girth of that graph. 
First we transform G into a planar graph G' of degree at most 3 and O(m)
vertices, where m = O(n) is the number of edges of G, by replacing each vertex
of G of degree k > 3 by a balanced binary tree with k leaves as illustrated on
Figure 1. Assign a cost cost(e) to any edge e of the tree equal either to one, if e
is incident to a leaf of the tree, or zero, otherwise. This transformation has the
following properties:

(i) The edges of G correspond to those edges of G' that have weight one;

(ii) G and G' have the same number of faces;

(iii) Any path of G of length ℓ (i.e., number of edges, since G is unweighted) is
transformed into a path of weighted length ℓ and at most ℓ log n edges.

Next we apply to G' the following algorithm. 



Algorithm Large_Girth

{Finds the girth of a biconnected plane graph G with n vertices, f faces,
maximum vertex degree 3, and nonnegative costs cost(·) on its edges}

1. Find a hammock decomposition of cardinality O(f) of G using the
algorithm from Theorem 1.

2. Preprocess G for shortest path queries using the preprocessing
algorithm from Theorem 2.

3. Construct a separator S of G of size O(√n) that divides G into
components of at most n/2 vertices each.

4. For each resulting component K, find the length of a shortest cycle
in K by applying this algorithm recursively.

5. For each edge e incident to a vertex of S compute the length c(e) of
the shortest cycle in G containing e as follows.

(a) Change the cost of e to +∞ (which has the effect of deleting e
from G) by using the update algorithm from Theorem 2.

(b) Compute the distance c'(e) between the endpoints of e in the
modified graph using the query algorithm from Theorem 2.

(c) Assign c(e) := c'(e) + cost(e), where cost(e) denotes the original
cost of e.

(d) Assign to e its original cost by using the update algorithm from
Theorem 2.

6. Return the minimum length of a cycle in G by combining the results
from Steps 4 and 5.

Now we analyze the correctness of the algorithm and its running time for
γ = O(n^{1/2−ε}) for some ε > 0, since the value of the parameter γ that will be
chosen in Section 5 will satisfy that condition.








Fig. 1. Reducing the maximum vertex degree. 



Lemma 1. If G is a biconnected planar graph of degree three and girth(G) ≥ γ,
then Algorithm Large_Girth computes the girth of G in O(n^{3/2}/γ) time,
assuming γ = O(n^{1/2−ε}) for some ε > 0.

Proof: Correctness: Assume that c is a shortest cycle of G. If c does not contain
a vertex of S, then c will belong to some component K considered in Step 4. K
cannot contain a cycle shorter than c since otherwise c would not be a shortest
cycle of G. Hence the length of c will be correctly computed in Step 4.

If c contains a vertex from S, then c will contain an edge e incident to a 
vertex from S that is considered in Step 5. Since c is a shortest cycle of G, then 
c — e will be a shortest path between the endpoints of e in G — e (Figure 2). 
Hence in that case the length of c will be correctly computed in Step 5. 

Time complexity: The time for Steps 1, 2, 3, and 6 is O(n). The number of
iterations of Step 5 is O(√n), since each vertex of G (and thus each vertex of S)
has degree O(1). The time for each iteration of Step 5 is dominated by the time of
Step 5(b), which, according to Theorem 2 and Theorem 1, is O(log n + f).
Hence the time T(n, f) for Algorithm Large_Girth satisfies the recurrence

T(1, f) = O(1),

T(n, f) ≤ max{T(n/2, f_1) + T(n/2, f_2) | f_1 + f_2 ≤ f}
+ O(n) + O(√n (f + log n)), for n > 1.

For n > 1, one can represent T(n, f) as

T(n, f) = T_1(n) + T_2(n, f),

where

T_1(n) ≤ 2T_1(n/2) + O(n) and
T_2(n, f) ≤ max{T_2(n/2, f_1) + T_2(n/2, f_2) | f_1 + f_2 ≤ f}
+ O(√n · f).
Fig. 2. An illustration of the proof of Lemma 1. The cycle p + e is a shortest cycle
containing e. (Legend: × marks a separator vertex; the dotted path is a shortest
path p in G − e between the endpoints v and w of e.)



By unfolding, we have for the case n > 1

T_2(n, f) ≤ c(√n · f + √(n/2) · f_1 + √(n/2) · f_2 + ···)
= c f (√n + √(n/2) + ···) = c f · O(√n) = O(f √n).

Clearly T_1(n) = O(n log n) and hence T(n, f) = O(f √n + n log n).

Finally, since G is biconnected each face of G is a cycle, which implies that
the size of each face is not less than γ. Moreover, each edge of G belongs to
exactly two faces of the embedding and hence f ≤ 2·|E(G)|/γ = O(n/γ).

Thus T(n, f) = O(f √n + n log n) = O(n^{3/2}/γ) for γ = O(n^{1/2−ε}). □

4 Finding shortest cycles in graphs of small girth 

In this section we assume that the girth of the input graph is smaller than a certain
parameter γ whose value will be determined in Section 5. For the proof of the
next lemma we use a technique developed by Baker [3] and Eppstein [11].

Lemma 2. Let G be an n-vertex planar graph and let d be any integer. Then we
can find in O(n) time a set of subgraphs G_i of G with the following properties:

(i) The sum of the sizes of all subgraphs G_i is O(n);

(ii) Every subgraph of G_i has a separator of size O(d);

(iii) Any cycle of length at most 4d is contained in some of the subgraphs G_i.
Fig. 3. An illustration of the proof of Lemma 2.



Proof: Construct a breadth-first spanning tree T of G and divide the vertices
of G into levels according to their distance to the root of T. Define the graph G_i
to be the graph induced by the vertices on levels from id to (i + 3)d − 1.

Any vertex on a level between id and (i + 1)d − 1 will belong only to the
subgraphs G_{i−2}, G_{i−1} and G_i (if they exist). Thus any vertex of G will belong
to at most 3 of the subgraphs G_i. Property (i) follows.

In order to prove (ii), for each graph G_i define a new vertex v_i and connect
it to all vertices on level id. The resulting graph G_i' has a breadth-first spanning
tree of radius r = 3d. According to a lemma from [9], each weighted planar
graph with a breadth-first spanning tree of radius r has a weighted separator of
no more than 3r + 1 vertices, one of which is the root of the tree. Let H be any
subgraph of G_i. Assign a weight 1 to each vertex of H and a weight 0 to each
vertex of G_i' that is not in H. By applying the lemma from [9] on the weighted
version of G_i', we find a separator of H of size at most 9d = O(d).

Finally, for proving that condition (iii) holds, we note that the endpoints of
any edge of G belong either to the same level, or to two consecutive levels of
G. Let c be a cycle of length not exceeding 4d (Figure 3). Then the set of the
levels of the vertices on c forms an interval [a, b] of integers; let v be a vertex
of c on level ℓ = ⌊(a + b)/2⌋. Then each vertex on c is at distance at most d from v.
Hence, if the level ℓ is between levels (i + 1)d and (i + 2)d − 1, then all vertices
from c will belong to G_i. Thus requirement (iii) is also satisfied. □
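The level-based covering used in this proof is short enough to sketch directly. The code below is an illustration only: it assumes a connected graph given as a dictionary `adj` of adjacency lists, returns the vertex sets of the pieces rather than the induced subgraphs, and uses a hypothetical function name. Indices start at i = 0; a cycle whose middle level is below d lies within levels 0 to 2d − 1 and is therefore already contained in the first piece.

```python
from collections import deque


def level_cover(adj, root, d):
    """Vertex sets of the subgraphs G_i from the proof of Lemma 2.

    Piece i holds the vertices on BFS levels i*d .. (i+3)*d - 1.
    `adj[v]` lists the neighbours of v; the graph is assumed connected.
    """
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:
                level[w] = level[u] + 1
                queue.append(w)
    top = max(level.values())
    pieces = []
    for i in range(top // d + 1):
        lo, hi = i * d, (i + 3) * d - 1
        pieces.append({v for v, lv in level.items() if lo <= lv <= hi})
    return pieces
```

Each vertex falls into at most three pieces, which is where the O(n) total size of property (i) comes from.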

Next we show how to compute the length of a shortest cycle efficiently for 
weighted graphs with small separators using divide-and-conquer. 
Algorithm Small_Separator

{Finds the girth of a graph G with nonnegative costs cost(·) on
its edges if G has an O(t) separator and deg(G) = O(1)}

1. Construct a separator S of G of size O(t) dividing the graph into
components of no more than n/2 vertices each, where n denotes the
number of vertices of G.

2. For each component K of the partition of G find the length of the
shortest cycle in K by this algorithm recursively.

3. For each edge e incident to a vertex from S find the length c(e) of
the shortest cycle in G containing e as follows.

(a) Compute the distance c'(e) between the endpoints of e in G − e
by running the linear time single-source shortest paths algorithm
from [17] on G − e with source any of the endpoints of e.

(b) Assign c(e) := c'(e) + cost(e).

4. Return the minimum length of a cycle found in Step 2 or Step 3.
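Step 3 rests on one primitive: the shortest cycle through an edge e has length dist_{G−e}(u, v) + cost(e), where u and v are the endpoints of e. The sketch below illustrates that primitive and, for comparison, a naive girth routine that applies it to every edge. It is not the divide-and-conquer algorithm itself: it substitutes Dijkstra's algorithm with a binary heap for the linear-time planar method of [17], and the nested-dictionary representation `adj[u][v] = cost` and both function names are assumptions made for the example.

```python
import heapq
import itertools


def shortest_cycle_through_edge(adj, u, v):
    """Length of a shortest cycle containing edge {u, v}, or inf if none.

    `adj[x][y]` is the nonnegative cost of the undirected edge {x, y}.
    """
    tie = itertools.count()              # avoids comparing vertex objects
    dist = {u: 0}
    heap = [(0, next(tie), u)]
    while heap:
        d, _, x = heapq.heappop(heap)
        if d > dist.get(x, float("inf")):
            continue                     # stale heap entry
        if x == v:
            break                        # distance to v is final
        for y, w in adj[x].items():
            if {x, y} == {u, v}:
                continue                 # e is treated as deleted (G - e)
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, next(tie), y))
    return dist.get(v, float("inf")) + adj[u][v]


def girth_by_edge_removal(adj):
    """Naive baseline: minimum of the quantity above over all edges."""
    return min(
        (shortest_cycle_through_edge(adj, u, v) for u in adj for v in adj[u]),
        default=float("inf"),
    )
```

The algorithm in the text avoids this all-edges loop by applying the primitive only to edges incident to the separator and recursing on the components.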

Lemma 3. Algorithm Small_Separator computes the girth of G in O(tn log n)
time, assuming the tree-width of G is no more than t.

Proof: Correctness. Let c be a shortest cycle in G. If c contains a vertex of S,
then its length will be computed in some iteration of Step 3. Otherwise c will
be contained in some component of G − S and its length will be computed in
Step 2.

Time complexity. The time needed for Step 1 is O(n), where n is the number of
vertices of G [18]. Since each vertex of S has degree O(1), the number of
iterations of Step 3 is O(|S|) = O(t). The time for each iteration is dominated
by the time for running the shortest paths algorithm in Step 3(a), which is
O(n). This implies an O(tn) time bound for Step 3. Thus the total time T(n) for
Algorithm Small_Separator satisfies the recurrence

T(1) = O(1),

T(n) ≤ 2T(n/2) + O(tn), for n > 1,

whose solution is T(n) = O(tn log n). □

In a preprocessing step, we transform G into a graph G' of maximum vertex
degree three, where any edge e of G' has a cost cost(e) equal to 1 or 0
depending on whether e corresponds to an edge of the original graph or not
(see Figure 1). Then we apply to G' the following algorithm which is based on
Algorithm Small_Separator and Lemma 2.
Algorithm Small_Girth

{Finds the girth of an n-vertex planar graph G assuming girth(G) < γ}

1. Apply Lemma 2 on G with d = ℓ = ⌈γ log n⌉. Let G_1, ..., G_s be the
resulting subgraph cover of G.

2. For each G_i, i = 1, ..., s, find the length c_i of the shortest cycle in G_i
(c_i = ∞ if G_i is acyclic) by applying Algorithm Small_Separator.

3. Return girth(G) := min{c_i | i = 1, ..., s}.

Lemma 4. Algorithm Small_Girth finds the girth of a planar graph G in
O(γ n log^2 n) time, assuming girth(G) < γ.

Proof: The correctness follows from Lemma 2 and the correctness of Algorithm
Small_Separator. The time for Step 1 and Step 3 is O(n). By Lemma 3, the
time for computing c_i in Step 2 is O(γ log n · n_i log n_i), where n_i is the number of
the vertices of G_i, since by Lemma 2 each graph G_i has a separator of size O(ℓ),
where ℓ = γ log n. Thus the total time for Step 2 is

Σ_{i=1}^{s} O(γ log n · n_i log n_i) ≤ O(γ log^2 n) · Σ_{i=1}^{s} n_i
= O(γ log^2 n) · O(n) = O(γ n log^2 n).

□



5 The general case 

In the case of an arbitrary planar input graph G we find the girth of G simply 
by applying one of the algorithms from the previous two sections depending on 
the maximum face size of a given embedding of G. 



Algorithm Find_Girth

{Finds the girth of an arbitrary n-vertex planar graph G}

1. If G is not biconnected, then compute its biconnected components
and apply the remaining steps of this algorithm on each of the
biconnected components (since each cycle in a graph belongs to exactly
one of its biconnected components).

2. Transform G into a weighted planar graph G' with n' = O(n) vertices
and of a maximum vertex degree three (as discussed in Section 3 and
illustrated on Figure 1).

3. Embed G' in the plane and find the maximum face size h of the
embedding.

4. If h = O(n^{1/4}/log n), then compute the girth of G by applying on
G' Algorithm Small_Girth.

5. Else (if h = Ω(n^{1/4}/log n)) compute the girth by applying on G'
Algorithm Large_Girth.

Theorem 3. Algorithm Find_Girth finds the girth of an n-vertex planar graph
in O(n^{5/4} log n) time.

Proof: The correctness of the algorithm follows from Lemma 4 and Lemma 1.

If h = O(n^{1/4}/log n), then the time of the algorithm will be determined
by the time for running Algorithm Small_Girth. By Lemma 4, that time is
O(γ n log^2 n), where γ = girth(G). Since γ ≤ h = O(n^{1/4}/log n), the time
of the algorithm in this case is O((n^{1/4}/log n) · n log^2 n) = O(n^{5/4} log n).

If h = Ω(n^{1/4}/log n), then by Lemma 1 the time for running Algorithm
Find_Girth will be O(n^{3/2}/γ) = O(n^{3/2}/(n^{1/4}/log n)) = O(n^{5/4} log n). □
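The threshold n^{1/4}/log n in Steps 4 and 5 is the value at which the two running-time bounds coincide. The short derivation below is added for illustration and is obtained directly by equating the bounds of Lemma 4 and Lemma 1; it is not quoted from the paper.

```latex
\gamma\, n \log^2 n \;=\; \frac{n^{3/2}}{\gamma}
\;\Longleftrightarrow\;
\gamma^2 = \frac{n^{1/2}}{\log^2 n}
\;\Longleftrightarrow\;
\gamma = \frac{n^{1/4}}{\log n},
\qquad\text{and then}\qquad
\gamma\, n \log^2 n \;=\; \frac{n^{3/2}}{\gamma} \;=\; n^{5/4}\log n .
```

For smaller γ the bound of Lemma 4 is the better one, and for larger γ the bound of Lemma 1 is, which is why the algorithm switches between the two procedures at this value.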

6 Related problems 

The technique discussed here can be used to solve other problems related to
shortest cycles in graphs. Algorithm Find_Girth can be used to find a shortest
non-facial cycle of a planar graph in O(n^{5/4} log n) time. Furthermore, using graph
separation and divide-and-conquer, we can compute the girth of a directed graph
in O(n^{3/2}) time. Finally, making use of the O(√(gn)) separator theorem for graphs
of genus g > 0 [8,14,10], one can construct efficient algorithms for the above
problems for graphs of genus g = o(n). We will describe algorithms for solving
the above problems in the full version of the paper.

This paper leaves several open problems, including the following: 

1. Construct an o(n^{3/2}) time algorithm for computing the girth of a directed
planar graph.

2. Develop efficient algorithms for finding shortest cycles in graphs with
arbitrary nonnegative costs (lengths) on edges.

3. Construct o(nm) algorithms for finding the girth of general graphs.

It would also be interesting to implement the algorithm from this paper and
test its practicality.
Abstract. We prove that all NP problems over the reals with addition
and order can be solved in polynomial time with the help of a boolean 
NP oracle. As a consequence, the “P = NP?” question over the reals with 
addition and order is equivalent to the classical question. For the reals 
with addition and equality only, the situation is quite different since P is 
known to be different from NP. Nevertheless, we prove similar transfer 
theorems for the polynomial hierarchy. 



1 Introduction 

Just as in discrete complexity theory, the main goal of algebraic complexity the- 
ory is to prove superpolynomial lower bounds for certain “natural” problems. 
In several algebraic settings this goal has not been achieved yet. For instance, it 
is not known whether the resultant of two sparse univariate polynomials can be 
computed by straight-line programs of polynomial length (see [1 1 2j for a motiva- 
tion); the problem “VP = VNP?” in Valiant’s model of computation [ 1 , 31 1 4 ^ is 
still open; and the same is true of the “P = NP?” problem in the most interest- 
ing versions of the Blum-Shub-Smale model. It is not always clear whether these 
algebraic questions are easier than the well-known open questions from discrete 
complexity theory. Indeed, it was shown in that problems such as Knapsack 
can be solved in polynomial time on a real (multiplication-free) Turing machine 
under the hypothesis P = PSPACE. Therefore a superpolynomial lower bound 
(on the circuit size, or on the running time of a real Turing machine) for Knap- 
sack would imply a separation of P from PSPACE. In this paper we investigate 
similar questions for smaller complexity classes. Our main result is the following 
transfer theorem. 

Theorem 1. $P^0_{\mathbb{R}_{ovs}} = NP^0_{\mathbb{R}_{ovs}}$ if and only if P = NP. 

This implies for instance that Knapsack is in $P^0_{\mathbb{R}_{ovs}}$ under the hypothesis P = NP, 
which is a weaker hypothesis than P = PSPACE. Here $P^0_{\mathbb{R}_{ovs}}$ stands for the class 
of decision problems that can be solved in polynomial time by parameter-free 
real Turing machines over the structure $\mathbb{R}_{ovs}$ (i.e., the only legal operations are 

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 832-^3 2000. 
+, − and <). The main result of 0 was that $P^0_{\mathbb{R}_{ovs}} = PAR^0_{\mathbb{R}_{ovs}}$ is equivalent 
to P = PSPACE (PAR stands for “parallel polynomial time”). The complexity 
theory of $\mathbb{R}_{ovs}$ has been studied in [5, 2]. More background on computation over 
the reals and other algebraic structures can be found in the textbooks pa. 

In this paper, real Turing machines will be parameter-free unless stated otherwise. 
Results for machines with parameters follow from those for parameter-free 
machines. For instance: 

Corollary 1. $P_{\mathbb{R}_{ovs}} = NP_{\mathbb{R}_{ovs}}$ if and only if P/poly = NP/poly. 

Our proof of Theorem 1 relies on Meyer auf der Heide’s construction of linear 
decision trees for point location in arrangements of hyperplanes [7, 8]. Roughly 
speaking, we show in Theorem 2 that his construction can be made uniform 
if a boolean NP oracle is available. This result is established in section 2 and 
complexity-theoretic consequences are drawn in section 3 (as a byproduct, we 
obtain the unexpected result that problems such as the real Traveling Salesman 
Problem or Knapsack are $NP^0_{\mathbb{R}_{ovs}}$-complete for Turing machine reductions). Here 
Theorem 3 is a key result: problems in $NP^0_{\mathbb{R}_{ovs}}$ can be solved in polynomial 
time with the help of a boolean NP oracle. Theorem 1, its corollary, and the 
completeness results just mentioned then follow immediately. 

In the order-free structure $\mathbb{R}_{vs}$ (where addition, subtraction and equality 
tests are the only operations allowed) the situation is quite different from that in 
$\mathbb{R}_{ovs}$ since it is possible to prove the unconditional result $P_{\mathbb{R}_{vs}} \neq NP_{\mathbb{R}_{vs}}$, as 
shown originally by Meer 0. It would be interesting to obtain other separation 
results in this structure. Unfortunately, for several questions (such as the collapse 
of the polynomial hierarchy $PH_{\mathbb{R}_{vs}}$ and the separation of $PH_{\mathbb{R}_{vs}}$ from $PAR_{\mathbb{R}_{vs}}$) 
this turns out to be impossible with current techniques: the transfer theorems 
in section 4.2 show that these questions are as hard as outstanding open problems 
from discrete complexity theory. 



2 Point Location in an Arrangement of Hyperplanes 

We first recall some terminology regarding arrangements of hyperplanes. Let 
$H = \{h_1, \ldots, h_m\}$ be a set of hyperplanes in $\mathbb{R}^n$. We denote by $h_i^+$ and $h_i^-$ the 
two open half-spaces defined by $h_i$. For a point $x \in \mathbb{R}^n$, we set $z_i(x) = 0$ if 
$x \in h_i$ and $z_i(x) = 1$ (respectively, $-1$) if $x \in h_i^+$ (respectively, $x \in h_i^-$). We 
define $\varphi(x) = (z_1(x), \ldots, z_m(x))$. The faces of the arrangement $\mathcal{A}(H)$ are by 
definition the classes of the equivalence relation $x \sim y \Leftrightarrow \varphi(x) = \varphi(y)$ on $\mathbb{R}^n$. 

The dimension of a face is the dimension of its affine closure; a face of dimension 
0 is called a vertex, and a face of dimension n a cell. 
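
The definition of faces can be illustrated by a small sketch. It assumes (for illustration only) that a hyperplane is stored as a tuple of coefficients (a_1, ..., a_n, b) standing for the equation a_1 x_1 + ... + a_n x_n + b = 0; the function names are hypothetical and do not come from the paper. Two points lie in the same face exactly when their sign vectors agree.

```python
from fractions import Fraction


def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def sign_vector(hyperplanes, x):
    """Compute (z_1(x), ..., z_m(x)) for hyperplanes given as (a_1, ..., a_n, b)."""
    result = []
    for *a, b in hyperplanes:
        value = sum(ai * xi for ai, xi in zip(a, x)) + b
        result.append(sign(value))
    return tuple(result)


def same_face(hyperplanes, x, y):
    """Two points are in the same face iff their sign vectors coincide."""
    return sign_vector(hyperplanes, x) == sign_vector(hyperplanes, y)


# Illustrative arrangement in the plane: x1 = 0, x2 = 0 and x1 + x2 = 1.
H = [(1, 0, 0), (0, 1, 0), (1, 1, -1)]
inside = (Fraction(1, 4), Fraction(1, 4))
also_inside = (Fraction(1, 2), Fraction(1, 4))
on_two_lines = (Fraction(0), Fraction(1))
```

Here `inside` and `also_inside` belong to the same cell (the open triangle), while `on_two_lines` is a vertex of the arrangement.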

We can now state the problem. Let $\mathcal{H}_n$ be the set of all hyperplanes defined 
by equations with integer coefficients in $\{-2^{t(n)}, \ldots, 2^{t(n)}\}$, where $t$ is some fixed 
polynomial. We say that an algorithm solves the location problem (for the family 
of arrangements $\mathcal{H}_t = (\mathcal{H}_n)_{n \in \mathbb{N}}$) if, on an input point $(x_1, \ldots, x_n) \in \mathbb{R}^n$, it 
computes a system 

$$S = \begin{cases} f_i(y) < 0, & i = 1, \ldots, p \\ g_j(y) = 0, & j = 1, \ldots, r \end{cases}$$

made up of $p + r = n^{O(1)}$ affine (in)equations with integer polynomial-size 
coefficients such that the set of points $P_S$ of $\mathbb{R}^n$ satisfying $S$ is included in a face of 
$\mathcal{A}(\mathcal{H}_n)$, and $x \in P_S$. 

Theorem 2. The location problem for $\mathcal{H}_t$ is in $FP^0_{\mathbb{R}_{ovs}}(NP)$ for any polynomial 
$t \in \mathbb{Z}[X]$. This means that a system locating the input point can be computed in 
polynomial time by a Turing machine over $\mathbb{R}_{ovs}$ using a boolean NP oracle. 

Defining formally the model of “real Turing machine with a boolean oracle” 
used in this theorem would be tedious but completely straightforward. The idea 
is that such a Turing machine can only use the instructions “write-0” or “write- 
1” to write on its oracle tape. This ensures that the oracle query is a word of 
{0, 1}*, despite the fact that the other tapes of the Turing machine may contain 
arbitrary real numbers. 

Before proving the theorem we have to make an observation about (parameter-free) 
algorithms over the structure $\mathbb{R}_{ovs}$. By running such an algorithm on the 
formal input $(X_1, \ldots, X_n)$ and taking all possible paths into account, it is clear 
that each test is of the form $\sum_{i=1}^{n} a_i X_i + a_{n+1} > 0$ ($a_i \in \mathbb{Z}$). Thus, to a test 
on an input of length $n$ corresponds an oriented hyperplane of $\mathbb{R}^n$ (having an 
equation with integer coefficients). This allows us to define a notion of size for 
the coefficients of tests: 

Remark 1 All tests performed by a program running in time $q(n)$ have coefficients 
bounded by $2^{q(n)}$. 
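
The growth of test coefficients can be made concrete with a short sketch. It assumes a simplified model in which every register holds an affine form in the formal inputs X_1, ..., X_n (stored as an integer coefficient vector of length n + 1, the last entry being the constant term) and each step adds or subtracts two registers; the helper names are illustrative. Since one addition at most doubles the largest coefficient, q steps cannot exceed 2^q, and repeated self-addition reaches that value.

```python
def formal_inputs(n):
    """Coefficient vectors of X_1, ..., X_n and of the constant 1."""
    return [tuple(int(i == j) for j in range(n + 1)) for i in range(n + 1)]


def add(u, v):
    """Coefficient vector of the sum of two affine forms."""
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    """Coefficient vector of the difference of two affine forms."""
    return tuple(a - b for a, b in zip(u, v))


def max_coefficient(form):
    """Largest absolute value among the coefficients of an affine form."""
    return max(abs(c) for c in form)


def doubling_run(n, q):
    """Worst case for q additions: add a register to itself q times, starting from X_1."""
    register = formal_inputs(n)[0]
    for _ in range(q):
        register = add(register, register)
    return register
```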

Let $t$ be a polynomial in $\mathbb{Z}[X]$, and $\mathcal{L}_n \subseteq \mathbb{R}^n$ the union of the hyperplanes in the 
arrangement $\mathcal{H}_n$ defined in section 2. Before solving the point location problem 
for $\mathcal{H}_n$ we will describe an algorithm for recognizing $\mathcal{L}_n$. The union of the $\mathcal{L}_n$ 
for $n \geq 1$ is a language of $\mathbb{R}^\infty$ denoted $L_t$. 

Proposition 1. For any polynomial $t$, $L_t$ is in $P^0_{\mathbb{R}_{ovs}}(NP)$. 

The remainder of this section is devoted to the proof of Proposition 1 and 
Theorem 2. The algorithms are based on a construction of Meyer auf der Heide [7, 8], 
who has described families of polynomial depth linear decision trees deciding 
unions of hyperplanes.¹ We shall build a uniform machine with an oracle in NP 
performing the same computation. The proof of Proposition 1 is in three steps: 
in section 2.1 we describe an algorithm for deciding a union of hyperplanes on 
$[-1, 1]^n$. The size of the tests in this algorithm is analyzed in section 2.2 and 
the algorithm for deciding a union of hyperplanes in $\mathbb{R}^n$ is then obtained in 
section 2.3. Finally, this algorithm is turned into a point location algorithm in 
section 2.4. 

¹ As mentioned in the introduction, he has also described families of linear decision 
trees solving the whole location problem. 
2.1 Deciding a Union of Hyperplanes on a Cube 

We now describe a recursive method for deciding $\mathcal{L}_n = \bigcup_{h \in \mathcal{H}_n} h$. 

Lemma 1. There is a $P^0_{\mathbb{R}_{ovs}}(NP)$ algorithm which, for any input $x \in \mathbb{R}^n$, decides 
whether $x \in \mathcal{L}_n \cap [-1, 1]^n$. 

For the complete proof that the algorithm described below really is $P^0_{\mathbb{R}_{ovs}}(NP)$ 
we need polynomial size bounds on the coefficients of tests. These bounds are 
established in section 2.2. 

Given a point $y \in \mathbb{R}^n$ and a set $A \subseteq \mathbb{R}^n$, we denote by $\mathrm{Aff}(y, A)$ the affine 
closure of $\{y\} \cup A$, and by $P(y, A) = \{\lambda y + (1 - \lambda)x,\ x \in A,\ \lambda < 1\}$ the pyramid 
of top $y$ and base $A$. Recursion on the dimension $n$ of the cube is made possible 
by the following lemma. 

Lemma 2 (Meyer auf der Heide). Let $S = \{h_1, \ldots, h_p\}$ be a set of 
hyperplanes in $\mathbb{R}^n$ such that the intersection $I = \bigcap_{i=1}^{p} h_i$ is nonempty. Let $A$ be 
a polytope on a hyperplane $h_0$ which does not contain $I$, and let $s$ be a point 
in $I \setminus h_0$. If a program decides $L' = \bigcup_{i=1}^{p} (h_i \cap h_0)$ on $A$, then the program 
obtained by replacing each test $h'$ by $\mathrm{Aff}(s, h')$ (with the appropriate sign) decides 
$L = \bigcup_{i=1}^{p} h_i$ on $P(s, A)$. 

Let $x \in P(s, A)$. The previous lemma is clear when we notice the equivalence 

$$x \in h_1 \cup \ldots \cup h_p \Leftrightarrow (sx) \cap h_0 \in (h_1 \cap h_0) \cup \ldots \cup (h_p \cap h_0).$$

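Now we need a definition. A number $r > 0$ is a coarseness of a set of 
hyperplanes $h_1, \ldots, h_k$ of $\mathbb{R}^n$ if, for any ball $B$ of radius $r$, either $\{h_i;\ h_i \cap B \neq \emptyset\} = \emptyset$ 
or $\bigcap_{h_i \cap B \neq \emptyset} h_i \neq \emptyset$. Let $r_n$ be a coarseness of $\mathcal{H}_n$. In 0 it is shown that one can 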

take l/r„ = 

Here is how the algorithm works. 

If $n = 1$ we can decide whether $x \in \mathcal{L}_n$ by binary search (no NP oracle is needed). 
We now assume that $n > 1$, and set $H_0 = \mathcal{H}_n$. 

• Step 1. 

We subdivide the cube $C^1 = [-1, 1]^n$ into little cubes with radius smaller than 
$r_n$ (i.e., with edge length smaller than $2r_n/\sqrt{n}$). By binary search on each 
coordinate, we find a little cube $c^1$ such that $x \in c^1$. Let us call $H_1 = \{h \in 
H_0;\ h \cap c^1 \neq \emptyset\}$. There are two cases: 

(i) $H_1 = \emptyset$. 

(ii) Otherwise, $\bigcap_{h \in H_1} h \neq \emptyset$ by definition of coarseness. 

We can check with a boolean NP algorithm whether (i) holds, and reject 
the input if this is the case. If it turns out that we are in case (ii) we 
compute in polynomial time with the help of a boolean NP oracle a point $s^1$ in 
$\bigcap_{h \in H_1} h$. In order to do this we will in fact compute a strictly decreasing 
sequence $E_1, \ldots, E_j$ of affine subspaces such that $E_1$ is an element of $H_1$ and 
$E_j = \bigcap_{h \in H_1} h$ (note that $j \leq n$). Since the condition $h \cap c^1 \neq \emptyset$ is in NP, we can 
compute $E_1$ by prefix search with an NP oracle. $E_{i+1}$ can be computed from $E_i$ as 
follows. If it turns out that $E_i \subseteq h$ for all $h \in H_1$ we can halt with $E_j = E_i$. 
This condition can again be checked with a boolean NP oracle. Otherwise there 
exists $h \in H_1$ such that $\dim(h \cap E_i) = \dim E_i - 1$ (since we are in case (ii) the 
case $h \cap E_i = \emptyset$ is not possible). Such an $h$ can be determined by prefix search, 
and we set $E_{i+1} = E_i \cap h$. Finally, when $E_j = \bigcap_{h \in H_1} h$ has been determined we can 
easily compute a point $s^1$ on this affine space (e.g. by fixing some coordinates 
to 0). If $x = s^1$ we accept the input. Otherwise we repeat the same procedure in 
dimension $n - 1$ as explained below. 

First we determine a face $f^1$ of the cube $C^1$ such that $x$ is in $P(s^1, f^1)$, the 
pyramid of top $s^1$ and base $f^1$. Let $g^1$ be the affine closure of $f^1$ (the 
equation of $g^1$ is of the form $x_i = \pm 1$). Then Lemma 2 applies, so it remains to 
solve an $(n-1)$-dimensional problem: decide whether the point $(s^1 x) \cap g^1$ lies on 
$\bigcup_{h \in H_1} (h \cap g^1)$ on the $(n-1)$-dimensional cube $[-1, 1]^{n-1}$ of $g^1$. An important 
point is that if $r$ is a coarseness of a set $\{h, h_1, \ldots, h_p\}$ of hyperplanes, then $r$ 
is a coarseness of $h \cap h_1, \ldots, h \cap h_p$ on $h$. Since the hyperplane which plays the 
role of $h$ (namely $g^1$) is an element of $\mathcal{H}_n$, this will allow us to subdivide the 
$(n-1)$-dimensional cube with the same $r_n$ (and this remains true at further 
steps of the recursion). 

• Step k ($1 < k \leq n$). 

At this step, we work on the cube $C^k = [-1, 1]^{n-k+1}$ of the affine space $\{x_{i_1} = 
\varepsilon_1, \ldots, x_{i_{k-1}} = \varepsilon_{k-1}\}$ with a projected point $x^k$ (the values of $\varepsilon_j \in \{-1, 1\}$ and 
of the $i_j$ depend on which face of $C^j$ was chosen as base of the pyramid at step $j$). 
We subdivide $C^k$ into smaller cubes, and then locate $x^k$ in a little cube $c^k$ of $C^k$. 
Note that the coordinates of $x^k$ need not be computed explicitly. Instead, a test 
$h'$ on $x^k$ is done by performing the test $\mathrm{Aff}(s^1, \mathrm{Aff}(s^2, \mathrm{Aff}(\ldots, \mathrm{Aff}(s^{k-1}, h') \ldots)))$ 
on $x$. Let $H_k$ be the subset of hyperplanes of $\mathcal{H}_n$ that intersect all little cubes 
$c^1, \ldots, c^k$ found up to step $k$. We know that if $x$ lies on a hyperplane of $\mathcal{H}_n$, 
this point must in fact lie on a hyperplane of $H_k$. If $H_k = \emptyset$ we reject $x$ as in 
step (i). Otherwise we compute a point $s^k \in \bigcap_{h \in H_k} h$. If $k = n$ we 
accept $x$ if $x = s^k$, and reject otherwise. If $k < n$ we determine $i_k$ and $\varepsilon_k$, and 
go one step further into the recursion. 

2.2 Coefficients of Tests in the Location Algorithm 

What is the running time of the procedure described in section 2.1? As $r_n = 
2^{-n^{O(1)}}$, locating a point in a small cube by binary search always takes polynomial 
time. Moreover, it is clear that the number of operations performed (over $\mathbb{Z}$) to 
compute the coefficients of tests is polynomially bounded. To make sure that 
the algorithm is really $P^0_{\mathbb{R}_{ovs}}(NP)$, it just remains to check that the coefficients 
of these tests are of polynomial size. Details can be found in 
2.3 Deciding a Union of Hyperplanes on $\mathbb{R}^n$ 

Following Meyer auf der Heide, we can extend the method described previously to 
decide a union of hyperplanes on $\mathbb{R}^n$. This must be done without multiplication, 
and we have to keep the coefficients small. Details can be found in This ends 
the proof of proposition 1. 



2.4 Proof of the Location Theorem 

We are now ready for the proof of Theorem 2. Let $x = (x_1, \ldots, x_n)$ be the point 
to be located. We use a perturbation method to turn the algorithm of section 2.3 
into a point location algorithm. Set $\tilde{x} = (x_1 + \varepsilon_1, \ldots, x_n + \varepsilon_n)$ where $\varepsilon_1, \ldots, \varepsilon_n$ 
are infinitely small, positive, and $\varepsilon_1 \ll \varepsilon_2 \ll \cdots \ll \varepsilon_n$. Now we run on input $\tilde{x}$ 
the algorithm of Proposition 1 for deciding $\mathcal{L}_n = \bigcup_{h \in \mathcal{H}_n} h$. Of course we know 
in advance that $\tilde{x}$ will be rejected since this input does not lie on any hyperplane 
with integer coefficients; the point is that the collection of all tests performed on 
$\tilde{x}$ during the execution of this algorithm is a system which locates $\tilde{x}$ in $\mathcal{A}(\mathcal{H}_n)$. 
Let $\tilde{S} = \{f_1(y) < 0, \ldots, f_q(y) < 0\}$ be this system. Then for each $i$ we test 
whether $f_i(x) < 0$ or $f_i(x) = 0$: this yields a new system $S$, which locates $x$ 
in $\mathcal{A}(\mathcal{H}_n)$. Moreover the size conditions are fulfilled: $S$ is made of $n^{O(1)}$ affine 
(in)equations with integer coefficients of size $n^{O(1)}$. 



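3 Transfer Theorems for $\mathbb{R}_{ovs}$ 

We first recall some notations and results on the structure $\mathbb{R}_{ovs} = (\mathbb{R}, +, -, <)$ of 
the reals with addition and order. A real language (or real problem) is a subset of 
$\mathbb{R}^\infty = \bigcup_{n \in \mathbb{N}} \mathbb{R}^n$. The boolean part BP(L) of a language $L \subseteq \mathbb{R}^\infty$ is by definition 
$L \cap \{0, 1\}^\infty$. For a class $\mathcal{C}$ of real languages, $BP(\mathcal{C}) = \{BP(L),\ L \in \mathcal{C}\}$. 

Fact 1 $BP(P^0_{\mathbb{R}_{ovs}}) = P$, $BP(NP^0_{\mathbb{R}_{ovs}}) = NP$, $BP(P_{\mathbb{R}_{ovs}}) = P/poly$ and 
$BP(NP_{\mathbb{R}_{ovs}}) = NP/poly$. 

We recall that $NDP^0_{\mathbb{R}_{ovs}}$ is a “digital” version of $NP^0_{\mathbb{R}_{ovs}}$ where certificates 
are required to be boolean. More precisely, a problem $A \subseteq \mathbb{R}^\infty$ is in $NDP^0_{\mathbb{R}_{ovs}}$ if 
there exists $B \in P^0_{\mathbb{R}_{ovs}}$ and a polynomial $p$ such that 

$$x = (x_1, \ldots, x_n) \in A \Leftrightarrow \exists z \in \{0, 1\}^{p(n)}\ (x, z) \in B.$$

Fact 2 $NP^0_{\mathbb{R}_{ovs}} = NDP^0_{\mathbb{R}_{ovs}}$. 

The proofs of these results can be found in [5] and [2]. We can now state and 
prove a key transfer theorem. 

Theorem 3. $NP^0_{\mathbb{R}_{ovs}} \subseteq P^0_{\mathbb{R}_{ovs}}(NP)$. 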
Proof. Let $L \in NP^0_{\mathbb{R}_{ovs}}$. By Fact 2 there exists $A \in P^0_{\mathbb{R}_{ovs}}$ and a polynomial $r$ 
such that 

$$(x_1, \ldots, x_n) \in L \Leftrightarrow \exists z \in \{0, 1\}^{r(n)}\ (x, z) \in A.$$

For any $x \in \mathbb{R}^n$ and $z \in \{0, 1\}^{r(n)}$, the condition $(x, z) \in A$ can be checked in 
polynomial time $t(n)$ by a Turing machine $T$ over $\mathbb{R}_{ovs}$, and the polynomial $t$ 
depends only on $A$. Let $\mathcal{H}_n$ be the set of all hyperplanes of $\mathbb{R}^n$ with coefficients 
in $\{-2^{t(n)}, \ldots, 2^{t(n)}\}$. For any $z \in \{0, 1\}^{r(n)}$, if we run $A(\cdot, z)$ on the formal 
input $(X_1, \ldots, X_n)$, each test is of the form $\sum_{i=1}^{n} a_i X_i + a_{n+1} > 0$, with $a_i$ in 
$\{-2^{t(n)}, \ldots, 2^{t(n)}\}$. As a consequence, $L \cap \mathbb{R}^n$ is a union of faces of $\mathcal{A}(\mathcal{H}_n)$. The 
$P^0_{\mathbb{R}_{ovs}}(NP)$ algorithm deciding $L$ works in two steps. 

First, the input $x = (x_1, \ldots, x_n)$ is located in $\mathcal{A}(\mathcal{H}_n)$. By Theorem 2 this can 
be done in $FP^0_{\mathbb{R}_{ovs}}(NP)$. The output is a system $S$ of (in)equations of the 
form $h_i(x) < 0$ or $h_i(x) = 0$ such that the $h_i$'s have polynomial size coefficients 
and the set $P_S$ of the points of $\mathbb{R}^n$ satisfying $S$ is included in a face of $\mathcal{A}(\mathcal{H}_n)$. 

Then it remains to check whether $P_S$ is included in $L$ or in its complement. 
This can be done by a standard NP algorithm: we guess a certificate $z \in \{0, 1\}^{r(n)}$ 
and an accepting computation path of $T$. The set of inputs of $\mathbb{R}^n$ which given 
$z$ follow this computation path is a polyhedron $P$ defined by a system of linear 
inequalities of polynomial size. We accept $x$ if $P \cap P_S \neq \emptyset$. This linear 
programming problem can be solved in polynomial time. (Another method consists in 
guessing a rational point $q \in P_S$ with small coordinates - such a point exists, 
see 0 - and then running $T$ on $(q, z)$). 



Remark 2 Other inclusions can be obtained with this method. The point loca- 
tion algorithm described above can be used for any real language L in a complexity 
class Cr C PARg^^^ . Then it remains to check whether the resulting polyhedron 
Ps is included in L or in its complement. If C is “reasonable” , this will be fea- 
sible in Cz,. For example, we have PPr„„^ C P°_^^^(PP), C 

for k £ N and PARr_^^^ C Pr_^^^(PSPACE). Of course we have C 

Pr„„s(^^) For BPP, we only obtain C Pr^^^(NP©BPP) where © 

is the join operation. 

The results stated in the introduction (Theorem 1 and its corollary) are direct 
consequences of Theorem 3. 

Proof (of Theorem 1). If $P^0_{\mathbb{R}_{ovs}} = NP^0_{\mathbb{R}_{ovs}}$, P = NP by Fact 1. The converse 
follows immediately from Theorem 3. 

Proof (of Corollary 1). If $P_{\mathbb{R}_{ovs}} = NP_{\mathbb{R}_{ovs}}$, P/poly = NP/poly by Fact 1. For 
the converse, consider a problem $A$ in $NP_{\mathbb{R}_{ovs}}$. There exists $B \in NP^0_{\mathbb{R}_{ovs}}$ and 
parameters $\alpha_1, \ldots, \alpha_p$ such that $(x_1, \ldots, x_n) \in A \Leftrightarrow (x_1, \ldots, x_n, \alpha_1, \ldots, \alpha_p) \in 
B$. By Theorem 3, $B \in P^0_{\mathbb{R}_{ovs}}(NP)$, hence $B \in P^0_{\mathbb{R}_{ovs}}(P/poly)$ by the assumption 
P/poly = NP/poly. Encoding the advice function in an additional parameter 
yields $B \in P_{\mathbb{R}_{ovs}}$, therefore $A \in P_{\mathbb{R}_{ovs}}$ too. 
We will now give a completeness result. For a real language $L \subseteq \mathbb{R}^\infty$ let us 
define the integer part of $L$ as 

$$IP(L) = \bigcup_{n \in \mathbb{N}} \{(p_1, \ldots, p_n);\ (p_1, \ldots, p_n) \in L \text{ and } p_1, \ldots, p_n \in \mathbb{Z}\}.$$

For a real complexity class $\mathcal{C}$, we set $IP(\mathcal{C}) = \{IP(L),\ L \in \mathcal{C}\}$. 

Lemma 3. Let $A \subseteq \mathbb{R}^\infty$ be such that IP(A) is Turing NP-hard. Then $A$ is 
Turing $NP^0_{\mathbb{R}_{ovs}}$-hard. 

Proof. By Theorem 3, $NP^0_{\mathbb{R}_{ovs}} \subseteq P^0_{\mathbb{R}_{ovs}}(NP)$. As IP(A) is NP-hard, $P^0_{\mathbb{R}_{ovs}}(NP) \subseteq 
P^0_{\mathbb{R}_{ovs}}(IP(A))$; of course $P^0_{\mathbb{R}_{ovs}}(IP(A)) \subseteq P^0_{\mathbb{R}_{ovs}}(A)$. Here IP(A) is a boolean 
language used as a boolean oracle: as explained after Theorem 2, such an oracle 
only handles inputs made of 0’s and 1’s. We conclude that $NP^0_{\mathbb{R}_{ovs}} \subseteq P^0_{\mathbb{R}_{ovs}}(A)$. 

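Let us recall the definitions of two classical real languages. The real knapsack 
problem $\mathrm{Knapsack}_{\mathbb{R}}$ is defined by 

$$\mathrm{Knapsack}_{\mathbb{R}} \cap \mathbb{R}^{n+1} = \{(x_1, \ldots, x_n, s);\ \exists u_1, \ldots, u_n \in \{0, 1\},\ \sum_{i=1}^{n} u_i x_i = s\}.$$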
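
Membership in the real knapsack problem only involves a boolean certificate u, which is what makes the "digital" nondeterminism of Fact 2 natural here. The following sketch is a brute-force (exponential-time) membership test, meant only to illustrate the definition and not the algorithm of this paper; the function name is hypothetical, and exact rationals stand in for real inputs.

```python
from fractions import Fraction
from itertools import product


def in_real_knapsack(xs, s):
    """Return True iff some u in {0,1}^n satisfies sum(u_i * x_i) == s.

    Every boolean certificate u is tried in turn, so the running time is
    exponential in n; only additions and equality tests on the inputs are needed.
    """
    for u in product((0, 1), repeat=len(xs)):
        if sum(ui * xi for ui, xi in zip(u, xs)) == s:
            return True
    return False


weights = [Fraction(1, 2), Fraction(1, 3), Fraction(2)]
```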

The traveling salesman problem $\mathrm{TSP}_{\mathbb{R}}$ is the set of pairs $(A, d)$ where $A$ is a 
distance matrix of $\{1, \ldots, n\}$ and $d \in \mathbb{R}^+$, such that there exists a Hamiltonian 
tour over $\{1, \ldots, n\}$ of length at most $d$. The final result of this section follows 
immediately from Lemma 3. 

Proposition 2. $\mathrm{Knapsack}_{\mathbb{R}}$ and $\mathrm{TSP}_{\mathbb{R}}$ are Turing $NP^0_{\mathbb{R}_{ovs}}$-complete. 

4 The Polynomial Hierarchy over $\mathbb{R}_{vs}$ 

The result $P_{\mathbb{R}_{vs}} \neq NP_{\mathbb{R}_{vs}}$ was proved by Meer 0. In this section we show that 
similar arguments can be used to separate the lowest levels of the polynomial 
hierarchy $PH_{\mathbb{R}_{vs}}$. Separating the higher levels of the hierarchy is essentially 
“impossible” due to the transfer theorems established in section 4.2. These results 
rely on elementary observations on the structure of subsets of $\mathbb{R}^n$ definable in 
$\mathbb{R}_{vs}$ (see Lemma 5 and Lemma 6 in particular). In the following, these subsets 
will simply be called “definable sets”. Remember that definable sets are definable 
without quantifiers since $\mathbb{R}_{vs}$ admits quantifier elimination. Consequently, 
a definable set is nothing but a boolean combination of hyperplanes. 

As in the rest of the paper, we work with parameter-free machines unless 
stated otherwise. We point out in Remark 4 at the end of the paper that it is 
straightforward to generalize our transfer theorems to the case of machines with 
parameters. 

We first recall the generic path method. Let $M$ be a machine over $\mathbb{R}_{vs}$ stopping 
on all inputs, and $L$ the language decided by $M$. Given $n \in \mathbb{N} \setminus \{0\}$, we set 
$L_n = L \cap \mathbb{R}^n$. The generic path in $M$ for inputs of size $n$ is the path obtained by 
answering no to all tests of the form “$h(x) = 0$?” (unless $h = 0$, in which case 
we answer yes). 

This definition is effective in the sense that the generic path can be followed 
by running $M$ on the formal input $(X_1, \ldots, X_n)$. If $M$ is parameter-free this 
computation can be carried out on a classical Turing machine. Moreover, if $M$ 
works in time $t(n)$, it takes time $O(n\,t(n)^2)$ to apply the generic path method, and 
the tests that are performed by $M$ along this path can be computed effectively. 
Let us call $\{h_1, \ldots, h_r\}$ these tests. We have $r \leq t(n)$, and these hyperplanes 
have the following property: if the inputs following the generic path are rejected, 
$L_n \subseteq h_1 \cup \ldots \cup h_r$; otherwise these inputs are accepted and $L_n^c \subseteq h_1 \cup \ldots \cup h_r$. 
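
The generic path method can be sketched on a toy model. The sketch assumes, purely for illustration, that the machine has already been unfolded for a fixed input size into a decision tree whose internal nodes are tuples ("test", h, yes_child, no_child), where h is the integer coefficient vector of an affine form in the formal inputs, and whose leaves are ("accept",) or ("reject",). This tree encoding and the function name are not taken from the paper.

```python
def generic_path(node):
    """Follow the generic path of a decision tree over the formal inputs.

    A test "h(x) = 0?" is answered yes only when h is identically zero;
    otherwise it is answered no and h is recorded. Returns the recorded
    tests together with the outcome reached ("accept" or "reject").
    """
    tests = []
    while node[0] == "test":
        _, h, if_yes, if_no = node
        if any(c != 0 for c in h):
            tests.append(h)
            node = if_no
        else:
            node = if_yes
    return tests, node[0]


# Toy machine on inputs (x1, x2): accept iff x1 = x2 or x1 = 0.
# Forms are stored as (coefficient of X1, coefficient of X2, constant).
machine = (
    "test", (1, -1, 0),
    ("accept",),
    ("test", (1, 0, 0), ("accept",), ("reject",)),
)
tests, outcome = generic_path(machine)
```

In this toy example the generic path rejects, and the accepted inputs are indeed contained in the union of the two recorded hyperplanes.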

Note that the generic path method can be applied to an affine subspace 
$X \subseteq \mathbb{R}^n$ instead of $\mathbb{R}^n$, in which case we answer yes to a test “$h(x) = 0$?” if 
and only if $X \subseteq h$. Remember also that a definable subset $A$ of $\mathbb{R}^n$ is dense 
on $X$ iff it contains an open dense subset of $X$, and that this is equivalent to 
$\dim A \cap X = \dim X$. We summarize these observations in a lemma which will 
be used in section 4.2. In this lemma, $\mathrm{Sys}_n$ denotes the set of systems of affine 
equations in $n$ variables with coefficients in $\mathbb{Z}$. For $S \in \mathrm{Sys}_n$, $P_S$ denotes the 
affine subspace of $\mathbb{R}^n$ defined by $S$. 

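Lemma 4. Let $A$ be a language of $\mathbb{R}^\infty$ and $A^n = A \cap \mathbb{R}^n$. We denote by $L^n$ 
the set of systems $S \in \mathrm{Sys}_n$ such that $A^n$ is dense in $P_S$, and by $L$ the language 
$\bigcup_{n} L^n$. Assume that $A \in \mathcal{C}^0_{\mathbb{R}_{vs}}$ with $\mathcal{C} = PAR$ or $\mathcal{C} = \Sigma^k$ for some $k \in \mathbb{N}$. 
Then $L \in \mathcal{C}_{\mathbb{Z}_2}$. 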

Proof. Note that $A^n$ is definable for any $A \in \mathcal{C}^0_{\mathbb{R}_{vs}}$ (this is in fact true for any 
recursive language of $\mathbb{R}^\infty_{vs}$). We can therefore apply the generic path method 
described above to decide whether $A^n$ is dense in $P_S$. More precisely, consider 
first the case $A \in P^0_{\mathbb{R}_{vs}}$. Given a test hyperplane $h$, we can decide in polynomial 
time whether $P_S \subseteq h$ by linear algebra (for instance, we precompute $d = \dim(P_S)$ 
and $d + 1$ points $x_1, \ldots, x_{d+1}$ such that $P_S = \mathrm{Aff}(x_1, \ldots, x_{d+1})$; then we declare 
that $P_S \subseteq h$ if $x_i \in h$ for all $i = 1, \ldots, d + 1$). The same remark applies to the 
case $\mathcal{C} = PAR$ since test hyperplanes still have polynomial-size coefficients in 
this case. We conclude that $L$ is in P if $A \in P^0_{\mathbb{R}_{vs}}$, and $L$ is in $PAR_{\mathbb{Z}_2} = PSPACE$ 
if $A \in PAR^0_{\mathbb{R}_{vs}}$. 
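
The linear-algebra test used in this proof is simple enough to sketch. The sketch assumes that d + 1 points spanning the affine subspace have already been precomputed, and that a hyperplane is stored as a coefficient tuple (a_1, ..., a_n, b) for the equation a_1 x_1 + ... + a_n x_n + b = 0; both conventions and the function names are illustrative.

```python
def on_hyperplane(h, x):
    """Check whether the point x satisfies a_1 x_1 + ... + a_n x_n + b = 0."""
    *a, b = h
    return sum(ai * xi for ai, xi in zip(a, x)) + b == 0


def subspace_in_hyperplane(spanning_points, h):
    """An affine subspace lies in h iff all of its spanning points do.

    This holds because h is itself affine: it contains the affine closure
    of any set of points that it contains.
    """
    return all(on_hyperplane(h, p) for p in spanning_points)


# The line through (0, 0, 1) and (1, 1, 1) in three-dimensional space.
line = [(0, 0, 1), (1, 1, 1)]
```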

If $A \in (\Sigma^k)^0_{\mathbb{R}_{vs}}$ for some $k \geq 1$ we use the equivalence between real and 
boolean alternation for $\mathbb{R}_{vs}$ [2]: there exists a polynomial $p$ and $B \in P^0_{\mathbb{R}_{vs}}$ such 
that for any $x \in \mathbb{R}^n$, $x \in A$ iff 

$$Q_1 y_1 \in \{0, 1\}^{p(n)} \cdots Q_k y_k \in \{0, 1\}^{p(n)}\ (x, y_1, \ldots, y_k) \in B$$

(the quantifiers $Q_i$ alternate, starting with $Q_1 = \exists$). The set $A^n$ is dense in $P_S$ 
iff the statement 

$$Q_1 y_1 \in \{0, 1\}^{p(n)} \cdots Q_k y_k \in \{0, 1\}^{p(n)}\ F_n(y_1, \ldots, y_k) \qquad (1)$$

is true. Here $F_n(y_1, \ldots, y_k)$ stands for: “$\{x \in \mathbb{R}^n;\ (x, y_1, \ldots, y_k) \in B\}$ is dense in 
$P_S$”. Since $B \in P^0_{\mathbb{R}_{vs}}$, we know that $F_n(y_1, \ldots, y_k)$ can be decided in polynomial 
time by the generic path method. Therefore (1) shows that $L \in \Sigma^k$. 
Note that in the case $\mathcal{C} = \Sigma^k$ it is really necessary to go to boolean quantification 
before applying the generic path method (think for instance of the set of points 
$x \in \mathbb{R}$ defined by the formula $\exists y\ x = y$). 

4.1 Separation of Lower Levels 

It was pointed out in m that the Twenty Questions problem might be a plausible 
witness to the separation $P_{\mathbb{C}} \neq NP_{\mathbb{C}}$. Formally, this problem is defined as 
follows: 

TQ= eK”, G {0, . . . , 2” - 1}}. 

nGN 

Twenty Questions can be used to separate from 

Proposition 3. n U ^ 0. 

Proof is omitted. 

4.2 Transfer Theorems for the Polynomial Hierarchy 

In this section we show that it will be considerably harder to separate the higher 
levels of $PH_{\mathbb{R}_{vs}}$ than the lower levels. We begin with two lemmas. Lemma 5 is a 
remark on the structure of definable subsets of $\mathbb{R}^n$, and in Lemma 6 we build a 
generic $\Sigma^2_{\mathbb{R}_{vs}}$ formula deciding a definable set $A$ with the help of the predicate 
$\dim S \cap A = \dim S$ (the variable $S$ represents an affine subset of $\mathbb{R}^n$). 

Lemma 5. Any nonempty definable set $A \subseteq \mathbb{R}^n$ can be written as 

$$A = E_k \setminus (E_{k-1} \setminus (\ldots \setminus E_0))$$

where $E_i$ is a finite union of affine subspaces, $E_{i-1} \subseteq E_i$, $\overline{E_i \setminus E_{i-1}} = E_i$, and 
$k \leq \dim A$. 

Proof. If $\dim A = 0$ the result is clearly true since $A$ is a finite set of points. 
Assume by induction that the result is true for all definable sets of dimension at 
most $d - 1$, and let $A$ be a definable set of dimension $d$. The topological closure 
$\overline{A}$ of $A$ is a finite union of affine subspaces. If $A = \overline{A}$ we set $k = 0$ and $E_k = A$. 
Otherwise, consider the definable set $A_1 = \overline{A} \setminus A$. Since $\dim A_1 \leq d - 1$, for some 
$k \leq d$ one can write by induction hypothesis $A_1 = E_{k-1} \setminus (E_{k-2} \setminus (\cdots \setminus E_0))$ 
with $E_i$ a finite union of affine subspaces, $E_{i-1} \subseteq E_i$, $\overline{E_i \setminus E_{i-1}} = E_i$. Since 
$A = \overline{A} \setminus A_1$, we can take $E_k = \overline{A}$. 
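
As a small worked instance of this decomposition (an illustration added here, with the line $\ell$ and the point $p$ chosen arbitrarily), take for $A$ a line in the plane with one of its points removed:

```latex
\[
  A = \ell \setminus \{p\} \subseteq \mathbb{R}^2, \qquad
  \overline{A} = \ell, \qquad
  A_1 = \overline{A} \setminus A = \{p\},
\]
\[
  E_1 = \ell, \quad E_0 = \{p\}, \qquad
  A = E_1 \setminus E_0, \qquad
  E_0 \subseteq E_1, \quad \overline{E_1 \setminus E_0} = E_1, \quad
  k = 1 \le \dim A = 1 .
\]
```

The proof's recursion stops after one step here because $A_1$ is a finite set of points, which is the base case of the induction.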

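Lemma 6. For any definable set $A \subseteq \mathbb{R}^n$ we have: 

$$(x_1, \ldots, x_n) \in A \Leftrightarrow \exists S_1\, \forall S_2\ \ x \in S_1 \wedge \dim A \cap S_1 = \dim S_1 
\wedge (\dim A \cap S_1 \cap S_2 < \dim S_1 \cap S_2 \Rightarrow x \notin S_1 \cap S_2)$$

where $S_1$ and $S_2$ are affine subspaces of $\mathbb{R}^n$. 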
Proof. The result is clearly true if $A = \emptyset$. Otherwise, write $A = E_k \setminus (E_{k-1} \setminus 
(\ldots \setminus E_0))$ as in Lemma 5. Let $x \in A$ and $i_0 = \min\{i;\ i \equiv k \bmod 2,\ x \in E_i\}$. 
Then $x \notin E_{i_0-1}$: if $x$ belonged to $E_{i_0-1}$, since $x \in A$ there would exist $i < i_0$ 
such that $i \equiv k \bmod 2$ and $x \in E_i$. This would be in contradiction with the 
minimality of $i_0$. 

We first show the implication from left to right: let $S_1$ be a maximal affine 
subspace in $E_{i_0}$ containing $x$. Since $x \in S_1$ and $x \notin E_{i_0-1}$ the strict inclusion 
$S_1 \cap E_{i_0-1} \subsetneq S_1$ holds. Hence $\dim S_1 \setminus E_{i_0-1} = \dim S_1$ and $\dim A \cap S_1 = \dim S_1$. 
At last, if $\dim A \cap S_1 \cap S_2 < \dim S_1 \cap S_2$, then $S_1 \cap S_2 \subseteq E_{i_0-1}$. Thus $x \notin S_1 \cap S_2$. 

Conversely, assume now that $x$ satisfies the formula for $S_1 = S$. Since $A \cap S$ 
is definable, by Lemma 5 we can write $A \cap S = E_k \setminus (E_{k-1} \setminus (\ldots \setminus E_0))$. Here 
$E_k = \overline{A \cap S} = S$ (the second equality follows from $\dim A \cap S = \dim S$). $E_{k-1}$ is 
a finite union of affine subspaces. For any subspace $S_2$ in this union we have 
$\dim A \cap S \cap S_2 < \dim S \cap S_2$, therefore $x \notin S \cap S_2$. This shows that $x \notin E_{k-1}$, hence 
$x \in A \cap S$. 



Remark 3 If the definable set $A$ in the above lemma is a boolean combination 
of hyperplanes with coefficients in some subset $\mathcal{D} \subseteq \mathbb{R}$, then we can quantify only 
on affine subspaces defined by systems of affine equations with coefficients in $\mathcal{D}$. 

We can now state and prove our transfer theorems for $PH_{\mathbb{R}_{vs}}$. Note that there 
is a two level shift in Theorem 4 and Theorem 5. 

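Theorem 4. $P = PSPACE \Rightarrow PAR^0_{\mathbb{R}_{vs}} = (\Sigma^2)^0_{\mathbb{R}_{vs}} \cap (\Pi^2)^0_{\mathbb{R}_{vs}}$. 

Proof. Let us assume that P = PSPACE, and let $L \in PAR^0_{\mathbb{R}_{vs}}$ be decided by a 
family of P-uniform circuits with depth $t(n)$. It is enough to show that $PAR^0_{\mathbb{R}_{vs}} = 
(\Sigma^2)^0_{\mathbb{R}_{vs}}$ since $PAR^0_{\mathbb{R}_{vs}}$ is closed under complement. By Lemma 6, $L$ is decided 
by the following formula: 

$$(x_1, \ldots, x_n) \in L \Leftrightarrow \exists S_1\, \forall S_2\ \ x \in P_{S_1} \wedge \dim L^n \cap P_{S_1} = \dim P_{S_1} 
\wedge (\dim L^n \cap P_{S_1 \cup S_2} < \dim P_{S_1 \cup S_2} \Rightarrow x \notin P_{S_1 \cup S_2})$$

where $L^n = L \cap \mathbb{R}^n$, $S_1$ and $S_2$ are systems of at most $n$ affine equations with 
coefficients in $\{-2^{t(n)}, \ldots, 2^{t(n)}\}$ (Remark 3), and $P_S$ is the subspace of $\mathbb{R}^n$ 
defined by $S$. By Lemma 4 the condition $\dim L^n \cap P_S = \dim P_S$ can be checked 
in PSPACE, and therefore in P by hypothesis. 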

Theorem 5. For all $k \geq 0$: 

$$PH = \Sigma^k \Rightarrow PH^0_{\mathbb{R}_{vs}} = (\Sigma^{k+2})^0_{\mathbb{R}_{vs}}.$$

Proof. Consider a problem $L \in PH^0_{\mathbb{R}_{vs}}$: we have $L \in (\Sigma^q)^0_{\mathbb{R}_{vs}}$ for some $q \geq 0$. As 
in the proof of the previous theorem, we use the formula from Lemma 6. Since 
$PH^0_{\mathbb{R}_{vs}} \subseteq PAR^0_{\mathbb{R}_{vs}}$, by Remark 3 we can still quantify on systems of equations 
with coefficients in $\{-2^{p(n)}, \ldots, 2^{p(n)}\}$ for some polynomial $p$. By Lemma 4 
the condition $\dim P_S \cap L^n = \dim P_S$ can be checked in $\Sigma^q$ and thus in $\Sigma^k$ 
by hypothesis. Putting the resulting formula in prenex form shows that $L \in 
(\Sigma^{k+2})^0_{\mathbb{R}_{vs}}$. 



Our final transfer theorem is based on a slightly different technique. 

Theorem 6. P = NP ^ O = P°„^. 

Proof is omitted. 

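Remark 4 The three transfer theorems of this section can be extended to the 
case of machines with parameters. For example, let us show that $PH = \Sigma^k \Rightarrow 
PH_{\mathbb{R}_{vs}} = \Sigma^{k+2}_{\mathbb{R}_{vs}}$. For any problem $L \in PH_{\mathbb{R}_{vs}}$ there exist parameters $\alpha_1, \ldots, \alpha_p$ 
and a problem $L' \in PH^0_{\mathbb{R}_{vs}}$ such that $(x_1, \ldots, x_n) \in L$ iff $(x_1, \ldots, x_n, \alpha_1, \ldots, \alpha_p) 
\in L'$. By Theorem 5, $L' \in (\Sigma^{k+2})^0_{\mathbb{R}_{vs}}$ under the assumption $PH = \Sigma^k$. This implies 
that $L \in \Sigma^{k+2}_{\mathbb{R}_{vs}}$. 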
Abstract. Overregularization seen in child language learning, re verb 
tense constructs, involves abandoning correct behaviors for incorrect ones 
and later reverting to correct behaviors. Quite a number of other child 
development phenomena also follow this U-shaped form of learning, un- 
learning, and relearning. A decisive learner doesn’t do this and, in gen- 
eral, never abandons an hypothesis H for an inequivalent one where 
it later conjectures an hypothesis equivalent to H. The present paper 
shows that decisiveness is a real restriction on Gold’s model of itera- 
tively (or in the limit) learning of grammars for languages from positive 
data. This suggests that natural U-shaped learning curves may not be a 
mere accident in the evolution of human learning, but may be necessary 
for learning. The result also solves an open problem. 

Second-time decisive learners conjecture each of their hypotheses for a 
language at most twice. By contrast, they are shown not to restrict Gold’s 
model of learning, and correspondingly, there is an apparent lack of re- 
ports in child development of the opposite, W-shaped learning curves. 



1 Introduction 

In developmental and cognitive psychology there are a number of child devel- 
opment phenomena in which the child passes through a sequence of the form: 
child learns behavior X, child unlearns X, and then child relearns X P3|- This 
performance is described as U-shaped. For example, we have the important case 
of X = regularization in language acquisition j7J- We explain. In the learning of 
the proper forms for past tense (say, in English), children learn correct syntactic 
forms (for example, ‘called’ with ‘call’ and ‘caught’ with ‘catch’), then they over- 
regularize and begin to form past tenses by attaching regular verb endings such 
as ‘ed’ to the present tense forms (even in irregular cases like ‘catch’ where that 
is not correct), and lastly they correctly handle the past tenses (both regular 
and irregular). We also see U-shaped sequences for child development in such 
diverse domains as understanding of temperature, understanding of weight con- 
servation, the interaction between understanding of object tracking and object 
permanence, and face recognition M- Within some of these domains we also see 
temporally separate U-shaped curves for the child’s qualitative and quantitative 
assessments H3I. 

One wonders if the seemingly inefficient U-shaped sequence of learning, un- 
learning, and relearning is a mere accident of the natural evolutionary process 
that built us (e.g., language learning) humans or something that must be that 
way to achieve the learning at all.¹ We do not answer this very difficult empirical 
question. But, in the present paper, in the context of Gold’s formal model 
of language learning (from positive data) |n|, we show there are cases where 
successful learning requires U-shaped sequences of learning, unlearning, and 
relearning. More precisely, but informally, decisive learning 0 is learning in which 
the learner cannot (inefficiently) conjecture a hypothesis H_1, then conjecture a 
behaviorally inequivalent hypothesis H_2, and then conjecture a hypothesis H_3 
which is behaviorally equivalent to H_1. Hence, a decisive learner never returns 
to abandoned hypotheses and therefore continues to output correct hypotheses 
from the time it has output its first correct hypothesis. A consequence of our 
main results in the present paper (Theorem and Corollary ED is that there 
are some classes of r.e. languages learnable from positive data which cannot be 
learned decisively. Hence, there are cases where U-shaped curves, featuring un- 
learning, are necessary. It would be interesting in the future to characterize these 
cases in a way that provides insight into why we see so many cases of unlearning 
in child development. 
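
The informal definition of decisiveness can be checked mechanically on a finite sequence of conjectures. The sketch below assumes that behavioral equivalence is supplied as a decidable predicate `equivalent` (in general, equivalence of grammars is not decidable, so this is a toy setting); the function name is hypothetical.

```python
def is_decisive(conjectures, equivalent):
    """Return True iff the sequence never returns to an abandoned hypothesis.

    A violation is a triple of positions i < j < k such that conjectures[j]
    is inequivalent to conjectures[i] while conjectures[k] is equivalent
    to conjectures[i].
    """
    for i, first in enumerate(conjectures):
        abandoned = False
        for later in conjectures[i + 1:]:
            if not equivalent(first, later):
                abandoned = True
            elif abandoned:
                return False
    return True


def same(h1, h2):
    """Toy equivalence: hypotheses are labels, equivalent iff equal."""
    return h1 == h2
```

With this toy equivalence, a U-shaped sequence such as A, B, A is rejected, while A, A, B, B is accepted.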

A text for a language L is an infinite sequence of all and only the elements 
of L (together with some possible #’s).² A text for L should be thought of as 
a presentation of the positive data about L. Gold’s model of language learning 
from positive data 0 is also called EX-learning from text. A machine M 
EX-learns from text a language L iff (by definition) M, fed any text for L, outputs 
a sequence of grammars³ and this sequence eventually converges to some fixed 
grammar for L. A machine M BC-learns from text a language L iff (by 
definition) M, fed any text for L, outputs a sequence of grammars, and this 
sequence eventually converges to nothing but grammars for L.⁴ 

Our main result, i.e., Theorem 6 in Section 3, shows that there are classes
of r.e. languages which can be EX-learned from text, but cannot be decisively
BC-learned from text. From this we obtain in Corollaries 7 and 8 that decisive
learning limits learning power for EX-learning and BC-learning from text,
respectively. The latter result on BC-learning has been nicely shown by Fulk,
Jain and Osherson [3], whereas the result on EX-learning is apparently new and
answers an open question from 0. Note that it has already been known before
that when learning programs for functions, decisiveness does not limit learning
power, see Remark 10 for references and further explanation.

We informally define second-time decisive learning as learning in which, for 
each text input to the learner, there is no conjectured subsequence of hypotheses 
$H_1, H_2, H_3, H_4, H_5$ such that $H_1$ is behaviorally equivalent to $H_3$ and $H_5$ but



^1 In |Yli;-ij the concern is not this evolutionary question. It is, instead, about how
humans are actually doing U-shaped learning (and not about why humans bother 
to employ U-shaped learning in the first place). 

^2 The elements of L might occur arbitrarily often and in any order. The #’s represent
pauses. The only text for the empty language is an infinite sequence of such pauses.

^3 We have in mind type-0 grammars, or, equivalently, r.e. indices.

^4 EX-learning from text involves syntactic convergence to correct grammars. BC-learning
from text involves semantic or behaviorally correct convergence. 
inequivalent to $H_2$ and $H_4$.^5 Contrasting interestingly with our main result we
show in Proposition 15 in Section 4 that the learning power of second-time
decisive EX-learners is the same as that of unrestricted EX-learners. Hence,
the additional power of non-decisive learning is already achieved if we allow
the learner to “return” to each abandoned hypothesis at most once. Hence,
interestingly coinciding with the apparent lack of reports of W-shaped learning
in child development, we see that in Gold’s paradigm (generally influential in
cognitive science cni), W-shaped learning is not necessary.

We also show (Proposition 16 in Section 4) that EX-learnable classes which
contain the entire set of natural numbers, N, do have a decisive EX-learner.^6

2 Decisive Learning 

Next we present the definition of decisiveness formally. We use the variable $\sigma$
(with or without subscripts) for finite initial segments of texts and call them
strings. The range of a string $\sigma$ is the set of non-pauses in $\sigma$ and is denoted
by $\mathrm{rng}(\sigma)$. We write $\preceq$ for the prefix relation between strings and texts, i.e.,
for example $\sigma_1 \preceq \sigma_2$ just in case $\sigma_1$ is a prefix of $\sigma_2$. We write $\sigma\tau$ for the
concatenation of the strings $\sigma$ and $\tau$. The index $M(\sigma)$ is machine M’s conjectured
grammar based on the information contained in $\sigma$ and $W_{M(\sigma)}$ is the language
defined by the grammar $M(\sigma)$.

Definition 1. A learner M is decisive on a set S of strings iff there are no
three strings $\sigma_1$, $\sigma_2$ and $\sigma_3$ such that $\sigma_1$ and $\sigma_3$ are in S, $\sigma_1 \preceq \sigma_2 \preceq \sigma_3$ and
$W_{M(\sigma_1)}$ differs from $W_{M(\sigma_2)}$ but is equal to $W_{M(\sigma_3)}$.

A learner M is decisive iff it is decisive on the set of all strings. 

So a decisive learner avoids U-shaped learning curves as discussed in the in- 
troduction. We conclude this section with a series of remarks and their proofs 
describing some standard techniques for the construction of decisive learners. 
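
Definition 1 can be checked mechanically on a finite run of a learner. The following Python sketch is illustrative only: it assumes that the languages conjectured along increasing prefixes of one text are given as finite sets (stand-ins for r.e. languages, whose equality is undecidable in general), and the name `is_decisive` is ours, not the paper's.

```python
from itertools import combinations


def is_decisive(conjectures):
    """Return True iff no abandoned language is conjectured again.

    `conjectures[n]` stands for the language W_{M(sigma_n)} conjectured on
    the n-th prefix of a text; finite sets are an assumption of this sketch.
    """
    for i, k in combinations(range(len(conjectures)), 2):
        if conjectures[i] == conjectures[k]:
            # A differing conjecture strictly in between is a U-shape.
            if any(conjectures[j] != conjectures[i] for j in range(i + 1, k)):
                return False
    return True


u_shaped = [{1}, {1, 2}, {1}]           # learn, unlearn, relearn
monotone = [{1}, {1}, {1, 2}, {1, 2}]   # never returns to {1}
```

Here `u_shaped` violates the definition because the first and third conjectures agree while the second differs, whereas `monotone` satisfies it.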

Remark 2. A finite class C can always be decisively EX-learned. 

In order to obtain a corresponding learner, we fix a total ordering on $\mathcal{C}$ which
extends the partial ordering given by set theoretic inclusion. For every distinct
pair of sets in $\mathcal{C}$ we let I contain the least number which is in the larger (in
our total ordering) but not in the smaller set. The learner then is equipped
with a finite table which contains canonical indices^7 for I and for all sets in
$\{C \cap I : C \in \mathcal{C}\}$. On input $\sigma$, the learner outputs the index of the least set in
$\mathcal{C}$ which contains the intersection of $\mathrm{rng}(\sigma)$ with I, and, in case there is no such
set, retains the preceding index.
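
The table-driven learner just described can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions: the class consists of finite sets (so that sorting by size extends inclusion), positions in the sorted list play the role of canonical indices, pauses are written `'#'`, and all names are hypothetical.

```python
def make_finite_class_learner(sets):
    """Build the ordering, the test set I, and a learner for a finite class."""
    # Sorting by (size, elements) is a total order extending inclusion
    # for finite sets: a proper subset always has smaller size.
    order = sorted({frozenset(s) for s in sets},
                   key=lambda s: (len(s), sorted(s)))
    # I collects, for each pair, the least element of the larger set
    # (in the total order) that is missing from the smaller one.
    test_set = set()
    for a in range(len(order)):
        for b in range(a + 1, len(order)):
            test_set.add(min(order[b] - order[a]))

    def learner(sigma, previous=None):
        seen = {x for x in sigma if x != '#'} & test_set
        for index, candidate in enumerate(order):
            if seen <= candidate:
                return index          # least set containing rng(sigma) ∩ I
        return previous               # no such set: keep the preceding index

    return order, test_set, learner


order, test_set, learner = make_finite_class_learner([{0}, {0, 1}, {2}])
```

On a text for {0, 1} that starts with 0, the sketch first outputs the index of {0} and moves to the index of {0, 1} once 1 has appeared; it never goes back.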



^5 Equivalent hypotheses are just grammars generating identical languages; inequivalent
hypotheses are grammars generating distinct languages.

^6 A basic trick from m is employed, modified, in the proofs of both our Propositions
15 and 16.

^7 A canonical indexing m numerically encodes for each finite set both a procedure
to list it and its size (the latter so one knows when the listing is finished).
Remark 3. If a learner M is decisive on two sets $S_1$ and $S_2$ of strings such that
the classes $\{W_{M(\sigma)} : \sigma \in S_1\}$ and $\{W_{M(\sigma)} : \sigma \in S_2\}$ are disjoint, then M is
actually decisive on the union of $S_1$ and $S_2$.

For a proof, assume that there were strings $\sigma_1$, $\sigma_2$, and $\sigma_3$ with $\sigma_1 \preceq \sigma_2 \preceq \sigma_3$
where $\sigma_1$ and $\sigma_3$ are in the union of $S_1$ and $S_2$ and $W_{M(\sigma_1)}$ differs from
$W_{M(\sigma_2)}$ but is equal to $W_{M(\sigma_3)}$.
In case $\sigma_1$ and $\sigma_3$ were either both in $S_1$ or both in $S_2$, M could not be decisive
on the corresponding set $S_i$, whereas in case one of the strings were in $S_1$ and
the other in $S_2$, this would contradict the assumption on M, $S_1$, and $S_2$.

Remark 4. By delaying the learning process, we can transform a learner $M_1$
into a new learner $M_2$ for the same class such that the outputs of $M_2$ satisfy
certain properties. Here on input $\sigma$ the learner $M_2$ outputs $M_1(\gamma)$ where $\gamma$ is the
maximal prefix of $\sigma$ such that $M_2$ has already been able to verify that $M_1(\gamma)$
has the property under consideration. In the remainder of this remark, we make
this idea more precise and we argue that while delaying the learning process this
way we can preserve decisiveness.

Formally, we fix a binary computable predicate on strings, written in the
form $P_\sigma(\tau)$, such that for all strings $\sigma_1$, $\sigma_2$, and $\gamma$,

$[P_{\sigma_1}(\gamma) \text{ and } \sigma_1 \preceq \sigma_2] \text{ implies } P_{\sigma_2}(\gamma)$ (1)

and we define a partial function s on strings by

$s(\sigma) := \max\{\gamma : \gamma \preceq \sigma \text{ and } P_\sigma(\gamma)\}$

where it is to be understood that $s(\sigma)$ is defined iff the maximization in its
definition is over a nonempty set. Then by (1), the function s is nondecreasing
in the sense that if s is defined on strings $\sigma_1$ and $\sigma_2$ with $\sigma_1 \preceq \sigma_2$, then $s(\sigma_1)$ is
a prefix of $s(\sigma_2)$.

In case $s(\sigma)$ is defined, we let $M_2(\sigma) = M_1(s(\sigma))$ and, otherwise, we let
$M_2(\sigma) = e$ for some fixed index e. We will refer to such a transformation of $M_1$
by the expression delaying with initial value e and condition P and, informally,
we will call the learner $M_2$ a delayed learner with respect to $M_1$. For example, in
the sequel we will consider delayings with conditions P, firstly, such that $P_\sigma(\tau)$
is true iff the range of $\tau$ is contained in $W_{M_1(\tau),|\sigma|}$ and, secondly, such that $P_\sigma(\tau)$
is true for all $\sigma$ and $\tau$ where the computation of $M_1$ on input $\tau$ terminates in at
most $|\sigma|$ steps. The rationale for choosing these conditions will become clear in
connection with the intended applications.

Now assume that we are given a class C where for every text T for a set in
C, the values of the function s have unbounded length on the prefixes of T, that
is, there are arbitrarily long prefixes $\tau$ and $\sigma$ of T with $\tau \preceq \sigma$ such that $P_\sigma(\tau)$ is
true. Then it is immediate from the definition of $M_2$ that in case the learner $M_1$
learns C under the criterion EX or BC, the delayed learner $M_2$ learns C under
the same criterion.

Finally, assume that $M_1$ is decisive and that either $W_e \neq W_{M_1(\sigma)}$ for all
strings $\sigma$ or $W_e = W_{M_1(\lambda)}$ where $\lambda$ denotes the empty string. Exploiting that in
both cases by assumption on e the set $\{\sigma : W_{M_2(\sigma)} = W_e\}$ is closed under
taking prefixes, one can show that $M_2$ is again decisive.
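
The delaying transformation of Remark 4 can be sketched in Python. This is a hedged illustration, not the paper's construction verbatim: strings are modelled as tuples, the predicate is passed as a two-argument function `predicate(sigma, gamma)`, and the toy condition `slow_check` (a guess may be used only once the input is at least twice as long as the prefix it was made on) is our own example of a condition satisfying (1).

```python
def delay(m1, predicate, e):
    """Return the delayed learner M2 with initial value e and condition P."""
    def m2(sigma):
        # s(sigma) is the longest prefix gamma of sigma with P_sigma(gamma);
        # scan prefixes from longest to shortest and stop at the first hit.
        for n in range(len(sigma), -1, -1):
            gamma = sigma[:n]
            if predicate(sigma, gamma):
                return m1(gamma)
        return e  # s(sigma) is undefined: output the initial value
    return m2


def slow_check(sigma, gamma):
    # Toy condition (an assumption): monotone in sigma as required by (1).
    return len(gamma) >= 1 and 2 * len(gamma) <= len(sigma)


m2 = delay(len, slow_check, e=-1)  # M1 is a toy learner: len of its input
```

With these choices `m2` outputs the initial value on inputs of length 0 and 1, and afterwards lags behind the toy learner by a factor of two.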
3 The Limits of Decisive Learning

In this section, we show that decisiveness is a proper restriction for EX-learning
from text. In the proof of the latter result we will use Lemma 5 which relates
to a result stated in jSl Exercise 5-3]: every class which does not contain N and
can be EX-learned, can in fact be EX-learned by a nonexcessive learner. Here a
learner M is said to be nonexcessive if and only if M never outputs an index for
N, that is, for all $\sigma$, $W_{M(\sigma)}$ differs from N. Observe that Lemma 5 can be shown
with BC-learning replaced by EX-learning by essentially the same proof.

Lemma 5. Let C be an infinite class where every finite set is contained in all 
but finitely many sets in C. If the class C can be decisively BC-learned from
text, then it can be decisively BC-learned from text by a nonexcessive learner. 

Proof. By assumption there is a decisive BC-learner $M_0$ which learns C from
text. In case $M_0$ never outputs an index for N we are done. So fix a string $\tau_0$ such
that $W_{M_0(\tau_0)} = N$ and, by assumption on C, choose $A \neq N$ in C which contains
$\mathrm{rng}(\tau_0)$. For every text for A, the learner $M_0$ must eventually output an index for
A and consequently we can fix an extension $\tau$ of $\tau_0$ such that $M_0(\tau)$ is an index
for A. But A differs from N and thus for all extensions of $\tau$, the decisive learner
$M_0$ can never again output an index for N (whence, in particular, $N \notin C$). In the
construction of a nonexcessive BC-learner M as asserted in the lemma, the key
idea now is to restrict the output of M to indices of the form $M_0(\tau\sigma)$, except
for at most finitely many additional indices of sets in C which do not contain
the set $D = \mathrm{rng}(\tau)$. We partition the set of all strings into the sets

$S_1 = \{\sigma : D \not\subseteq \mathrm{rng}(\sigma)\}$ and $S_2 = \{\sigma : D \subseteq \mathrm{rng}(\sigma)\}$

and we partition C into the classes

$C_1 = \{L \in C : D \not\subseteq L\}$ and $C_2 = \{L \in C : D \subseteq L\}$.

By assumption on C, the class $C_1$ is finite and we can fix a decisive EX-learner
$M_1$ for $C_1$ as in Remark 2 which in particular outputs only indices for
sets in $C_1$. Concerning the class $C_2$, first consider a learner $\widetilde{M}_2$ which on input
$\sigma$ outputs $M_0(\tau\sigma)$. We leave to the reader the routine task of showing that $\widetilde{M}_2$
BC-learns $C_2$ and inherits the property of being decisive from $M_0$. Let $M_2$ be
the learner obtained according to Remark 4 by delaying $\widetilde{M}_2$ with initial index e
and condition P where e is an index for some set which neither contains D nor
is in $C_1$ and the condition P is defined by

$P_\sigma(\gamma)$ iff D is contained in $W_{\widetilde{M}_2(\gamma),|\sigma|}$.

By the discussion in Remark 4 the delayed learner $M_2$ is again a decisive BC-learner
for $C_2$. Now we obtain a learner M as required where



$M(\sigma) = \begin{cases} M_1(\sigma) & \text{in case } \sigma \text{ is in } S_1, \\ M_2(\sigma) & \text{otherwise.} \end{cases}$
In order to see that M BC-learns C, assume that M is presented with a text T
for a set L in C. In case L is in $C_1$, the learner M agrees on all prefixes of T
with the learner $M_1$ for $C_1$ while, similarly, in case L is in $C_2$, the learner M
agrees on almost all prefixes of T with the learner $M_2$ for $C_2$. In order to see
that M is decisive it suffices to observe that $S_1$ and $S_2$ witness that M satisfies
the assumption of Remark 3. The latter holds because $M_1$ and $M_2$ are both
decisive and because every set of the form $W_{M_2(\sigma)}$ contains D, whereas no set
of the form $W_{M_1(\sigma)}$ contains D. □

Theorem 6 and Corollary 7 are the main results of this paper.

Theorem 6. There is a class which can be EX-learned from text, but cannot be
decisively BC-learned from text.

From Theorem 6 the following corollaries are immediate. Here Corollary 7 answers
an open problem stated in 0 and 0, while Corollary 8 has been previously
shown by Fulk, Jain and Osherson [3]. We will argue in Remark 17 that the proof
of Corollary 8 in [3] neither yields Theorem 6 nor Corollary 7.

Corollary 7. The concept of decisive EX-learning from text is a proper restriction
of EX-learning from text, that is, there is a class which can be EX-learned
from text, but cannot be decisively EX-learned from text.

Corollary 8. The concept of decisive BC-learning from text is a proper restriction
of BC-learning from text, that is, there is a class which can be BC-learned
from text, but cannot be decisively BC-learned from text.

Proof of the theorem. In order to construct a class C as required we define, for
the scope of this proof, for all subsets A of N a notion of id by

$\mathrm{id}(A) := \min\{m \in N \cup \{\infty\} : m \text{ is not in } A\}$.

In terms of the id of a set we can already now summarize the features of the
class C which are relevant for making it EX-learnable. We will have

$C = H \cup \{W_{g(m)} : m \in N\}$ with $H = \{\mathrm{rng}(\sigma) : \sigma \in Z\}$ (2)

where Z and g are defined below such that they have the following properties.
The set Z is recursively enumerable and the function g is computable in the
limit, that is, there is a computable function $\tilde{g}$ in two arguments such that $g(m)$
is equal to $\lim_{s \to \infty} \tilde{g}(m, s)$. Furthermore for all m, the set $W_{g(m)}$ has id m and
there are at most finitely many $\sigma$ in Z where $\mathrm{rng}(\sigma)$ has id m.
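
For finite sets the id is easy to compute. The Python sketch below is only an illustration (the name `set_id` is ours); it does not cover the set N itself, whose id is infinite by the definition above.

```python
from itertools import count


def set_id(a):
    """Least natural number that is not in the finite set `a`."""
    return next(m for m in count() if m not in a)


examples = [set_id({0, 1, 3}), set_id(set()), set_id({1, 2})]
```

Note that the id is monotone under inclusion: a subset can only be missing a smaller or equal least number, which is the fact used later when the id of a range is compared with the id of a conjectured superset.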

Claim 1. The class C can be EX-learned from text. 

Proof. We construct a learner M which EX-learns C from text. Given a language
L of id m in C, let $H_m$ contain the sets of id m in H. By construction of C,
the number m and the class $H_m$ are both finite and L must be equal to $W_{g(m)}$
or to one of the sets in $H_m$. Moreover, for every text for L almost all prefixes
of the text have id m and, by assumption on Z and g, from m we can compute
in the limit $g(m)$ and a list of canonical indices for the members of $H_m$. Thus
we can simply assume that M has access to m, $g(m)$, and $H_m$, as long as we
ensure that M is defined even on prefixes of the given text for L for which its
approximations to these values are not yet correct. So the learner M can simply
output some fixed index for $\mathrm{rng}(\sigma)$ in case this set is in $H_m$ while, otherwise, M
outputs $g(m)$. □

Claim 2. If C can be decisively BC-learned, then it can be decisively BC-learned
by a primitive recursive learner.

Proof. Assume that a learner $N_1$ BC-learns C decisively. Consider the delayed
learner $N_2$ obtained by delaying $N_1$ with initial index $N_1(\lambda)$ and condition P
where $P_\sigma(\tau)$ is true iff the computation of $N_1$ on input $\tau$ terminates after at most
$|\sigma|$ steps. Then $N_2$ is again a BC-learner for C which, by Remark 4, is again
decisive. Moreover, $N_2$ can obviously be chosen to be primitive recursive. □

Fix an enumeration $M_0, M_1, \ldots$ of all primitive recursive learners such that
$M_i(\sigma)$ is uniformly computable in i and $\sigma$. According to Claim 2, in order to
achieve that C cannot be decisively BC-learned, it suffices to ensure by diagonalization
that for all i, the class C is not decisively BC-learned by $M_i$. While
diagonalizing against $M_i$ we will exploit that for every string $\sigma$ with

$\mathrm{rng}(\sigma) \subsetneq W_{M_i(\sigma)}$, (3)

if the learner $M_i$ learns $\mathrm{rng}(\sigma)$ and $W_{M_i(\sigma)}$, then it cannot be decisive. In order
to see the latter observe that by extending $\sigma$ to a text for $\mathrm{rng}(\sigma)$ we eventually
reach a string $\tau$ where $M_i(\tau)$ is an index for $\mathrm{rng}(\sigma)$, while on every text for
$W_{M_i(\sigma)}$ which extends $\tau$, the learner $M_i$ converges to nothing but indices for the
previously abandoned guess $M_i(\sigma)$.

For the scope of this proof, a string $\sigma$ which satisfies (3) will be called an
i-witness. During the construction we try to diagonalize, for all indices i, against
the learner $M_i$ by putting $\mathrm{rng}(\sigma)$ and $W_{M_i(\sigma)}$ into C for some i-witness $\sigma$. Here,
however, we have to observe that the remaining choices in the definition of the
class C amount to specifying a set Z and a function g which have the properties
stated above.

We fix an effective enumeration of all pairs $(\sigma, i)$ such that $\sigma$ is an i-witness.
For every i, we let $Z_i$ be the (possibly finite or even empty) set such that for
every natural number $l \ge i$, among all the i-witnesses with range of id l, the
set $Z_i$ contains the one which is enumerated first and we let Z be the union
of the sets $Z_i$. Then Z is recursively enumerable by construction. Moreover, for
every m, there are at most finitely many strings $\sigma$ in Z such that $\mathrm{rng}(\sigma)$ has
id m because each $Z_i$ contains at most one such witness and every set $Z_i$ with
$i > m$ contains no such witness. From the definition of the concept i-witness it
is immediate that for all i,

$\mathrm{id}(\mathrm{rng}(\sigma)) \le \mathrm{id}(W_{M_i(\sigma)})$ for all $\sigma$ in $Z_i$. (4)
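
The selection rule for the sets of witnesses can be illustrated on a finite initial part of the enumeration. The Python sketch below makes several assumptions that are not in the paper: strings are tuples with `'#'` as pause, every listed pair is taken on trust to be a witness (the witness property itself is not checked), and the function names are ours.

```python
from itertools import count


def range_id(sigma):
    """Id of rng(sigma): least natural number not occurring in sigma."""
    content = {x for x in sigma if x != '#'}
    return next(m for m in count() if m not in content)


def select_witnesses(enumeration, i):
    """First-enumerated i-witness for each range id l with l >= i."""
    chosen = {}
    for sigma, j in enumeration:
        if j != i:
            continue
        l = range_id(sigma)
        if l >= i and l not in chosen:
            chosen[l] = sigma       # later witnesses with the same id lose
    return chosen


pairs = [((0, 1), 0), ((1, 0), 0), ((5,), 0), ((0,), 1), ((3,), 1)]
z0 = select_witnesses(pairs, 0)
z1 = select_witnesses(pairs, 1)
```

In this toy enumeration the string `(1, 0)` is discarded because `(0, 1)` with the same range id came first, and `(3,)` is discarded for index 1 because its range id 0 lies below 1.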

We have arranged by the definition of Z that the class C contains $\mathrm{rng}(\sigma)$ for all
i and all witnesses in $Z_i$. Thus in order to diagonalize against the learner $M_i$ it
suffices to put $W_{M_i(\sigma)}$ into C for some $\sigma$ in $Z_i$, where by definition of C this
can be done by ensuring that the index $M_i(\sigma)$ appears in the range of g. Here,
however, for every m, we can only put a single set $W_{g(m)}$ of id m into C. In
order to define g accordingly, we let for all i,

$E_i := \{\mathrm{id}(W_{M_i(\sigma)}) : \sigma \in Z_i\}$

be the set of all ids which are realized by sets of the form $W_{M_i(\sigma)}$ with $\sigma$ in $Z_i$.
Moreover, we fix a recursive function $k : N^3 \to N$ such that for all i and m where
m is in $E_i$, firstly, the limit

$\kappa(i, m) = \lim_{s \to \infty} k(i, m, s)$

exists, secondly, $\kappa(i, m) = M_i(\sigma)$ for some $\sigma$ in $Z_i$ and, thirdly, $W_{\kappa(i,m)}$ has id m.
Such a function k can be constructed by running on input i, m, and s for s steps
some fixed procedure which enumerates in parallel the sets $W_{M_i(\sigma)}$ with $\sigma$ in $Z_i$
and outputs the index $M_i(\sigma)$ which first enumerated all numbers less than m
but hasn’t yet enumerated m itself. Next we define sets $S_m$ and a function h by
a straightforward priority construction, which, however, is non-effective:

$S_m = \{i : m \in E_i \text{ and } h(l) \neq i \text{ for all } l < m\}$,

$h(m) = \begin{cases} \min S_m & \text{if } S_m \neq \emptyset, \\ * & \text{otherwise.} \end{cases}$



Intuitively speaking, the set $S_m$ contains the indices i such that, firstly, we have
not yet diagonalized explicitly against learner $M_i$ and, secondly, we have a chance
to diagonalize now because $Z_i$ contains a witness $\sigma$ such that $W_{M_i(\sigma)}$ has id m.
Moreover, for all i in $S_m$, such an index $M_i(\sigma)$ is given by $\kappa(i, m)$. We pick an
appropriate easily computable index $e_m$ for the set $N \setminus \{m\}$ and define

$g(m) := \begin{cases} \kappa(h(m), m) & \text{if } h(m) \neq *, \\ e_m & \text{otherwise.} \end{cases}$
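
The priority construction of h can be traced on finite data. The following Python sketch is an illustration under assumptions of ours: the sets of realized ids are given explicitly as finite Python sets (in the proof they are only approximable in the limit), `None` plays the role of the undefined value `*`, and the name `priority_assignment` is hypothetical.

```python
def priority_assignment(E, bound):
    """Compute h(0), ..., h(bound - 1) from the list E of id sets E_0, E_1, ..."""
    used = set()   # indices i already chosen as h(l) for some l < m
    h = []
    for m in range(bound):
        # S_m: indices with m in E_i that have not been served before.
        candidates = [i for i, ids in enumerate(E) if m in ids and i not in used]
        if candidates:
            choice = min(candidates)   # smallest index has highest priority
            used.add(choice)
            h.append(choice)
        else:
            h.append(None)             # corresponds to h(m) = *
    return h


h = priority_assignment([{1, 2, 3}, {1, 2}, {5}], bound=6)
```

Each index is served at most once, which is the property used later to argue that an index whose set of ids is infinite is eventually chosen.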



Claim 3. The function g is computable in the limit. 

Proof. By definition, g(m) is computable in the limit from h(m) and m, whence
it suffices to show that the function h is computable in the limit. Now for given
m, the inductive definition of h on 0, . . . , m depends only on the membership of
numbers less than or equal to m in the sets $E_0, E_1, \ldots$. Furthermore, each set
$E_i$ contains exactly the ids of sets of the form $W_{M_i(\sigma)}$ with $\sigma$ in $Z_i$. Now for
every string $\sigma$ in $Z_i$, the id of $\mathrm{rng}(\sigma)$ and hence by (4) also the id of $W_{M_i(\sigma)}$
is bounded from below by i, whence the set $E_i$ does not contain numbers less
than i. In summary, we can compute h(m) if we are given canonical indices of
the sets $F_0, \ldots, F_m$ with $F_i = E_i \cap \{0, \ldots, m\}$. Here, by the definition of $E_i$,
the set $F_i$ contains exactly the numbers $l \le m$ such that for some $\sigma$ in $Z_i$ the
id of $W_{M_i(\sigma)}$ is l and by (4) the range of such a string $\sigma$ has id at most m. Now
by definition of the sets $Z_i$ we can effectively in i enumerate the finitely many
$\sigma$ in $Z_i$ such that $\mathrm{rng}(\sigma)$ has id at most m. Moreover, for each such $\sigma$, the ids of
the sets $W_{M_i(\sigma),0}, W_{M_i(\sigma),1}, \ldots$ converge to the, possibly infinite, id of $W_{M_i(\sigma)}$,
whence in the limit this approximation converges to the finite value $\mathrm{id}(W_{M_i(\sigma)})$ or
reveals that the latter value is strictly larger than m. So we are done because by
combining the two approximation processes just described we obtain an effective
procedure which on input i converges to a canonical index for $F_i$. □

Claim 4. Let i be such that $E_i$ is infinite. Then the class C contains a set of the
form $W_{M_i(\sigma)}$ with $\sigma$ in $Z_i$, whence in particular $M_i$ cannot learn C decisively
from text.

Proof. If there is an i-witness $\sigma$ in $Z_i$ such that C contains $W_{M_i(\sigma)}$, then by
construction the class C contains also $\mathrm{rng}(\sigma)$ and hence cannot be learned decisively
by $M_i$. So if there is an m with $h(m) = i$ then $g(m)$ is equal to $M_i(\sigma)$ for
some $\sigma$ in $Z_i$ and we are done by definition of C. But assuming that h(m) differs
from i for all m yields a contradiction because in this case h(m) is by definition
an element of $\{0, \ldots, i-1\}$ for all m in the infinite set $E_i$, while on the other
hand the function h attains every value in N at most once. □

Claim 5. The set $\{\mathrm{id}(L) : L \in C \text{ and } L \text{ infinite}\}$ is infinite.

Proof. It suffices to show that for every given l, there is an infinite set in C which
has finite id larger than l. We fix an index i such that on input $\sigma$, the learner
$M_i$ outputs an index for the set $N \setminus \{m_\sigma\}$ where $m_\sigma$ is the least number which
is strictly greater than l and all the numbers in $\mathrm{rng}(\sigma)$. Then every string $\sigma$ is
an i-witness where the set $W_{M_i(\sigma)}$ has finite id $m_\sigma$. Thus the set $E_i$ is infinite,
whence by Claim 4 the class C contains a set of the form $W_{M_i(\sigma)}$ as required. □

Claim 6. Let i be such that $M_i$ BC-learns the class C from text. Then the set
$Z_i$ is infinite.

Proof. Fix an arbitrary natural number $m_0$ and by Claim 5 let $L \neq N$ be an
infinite set in C of id $m > m_0$. On every given text T for L, the learner $M_i$
eventually converges to nothing but indices for L, whence almost all prefixes of
T are i-witnesses with range of id m. Now $m_0$ has been chosen arbitrarily and
consequently there are i-witnesses with range of arbitrarily large id, whence the
set $Z_i$ is infinite by its definition. □

Claim 7. Let i be such that $M_i$ never outputs an index for N. If $Z_i$ is infinite,
then also $E_i$ is infinite.

Proof. By construction, the strings in $Z_i$ all have ranges of different id, whence in case $Z_i$
is infinite, the set $\{\mathrm{id}(\mathrm{rng}(\sigma)) : \sigma \in Z_i\}$ is infinite, too. Then the set $E_i$, which
by definition contains just the ids of the sets $W_{M_i(\sigma)}$ with $\sigma$ in $Z_i$, must contain
arbitrarily large values because of (4). Hence the set $E_i$ is infinite because by
assumption on $M_i$ the sets of the form $W_{M_i(\sigma)}$ all have finite id. □
Now assume for a contradiction that the class C can be decisively BC-learned
from text. Then by Lemma 5 and the proof of Claim 2 this fact would be
witnessed by a decisive learner $M_i$ which never outputs an index for N. Thus
the sets $Z_i$ and $E_i$ are both infinite by Claims 6 and 7, whence by Claim 4 the
learner $M_i$ cannot learn C as assumed. □

Theorem 6 above and the following Remark 9 show that the concepts of EX-
learning from text and decisive BC-learning from text are incomparable in the
sense that for each of these concepts there are classes which can be learned under
this concept but not under the other one. In Remark 9, which we state without
proof, we exploit that one of the standard constructions of a class which can be 
BC- but not EX-learned from text actually yields a class which can be decisively 
BC-learned from text. 

Remark 9. The class of all sets of the form $K \cup D$ where K is the halting
problem and D is a finite subset of N can be decisively BC-learned from text,
but cannot be EX-learned from text.

Remark 10. Schäfer-Richter (see [12] and 0 Section 4.5.5]) showed that every
class of functions which can be EX-learned can in fact also be decisively
EX-learned. The same holds for BC-learning as shown by Fulk, Jain, and Osherson
[3] and, implicitly, by Freivalds, Kinber, and Wiehagen 0.

4 The Power of Decisive Learning 

While we have shown in the last section that decisiveness properly restricts 
EX-learning, we will show now that in certain respects decisive and general 
EX-learning are rather close. We show that every EX-learnable class C can be 
learned under a criterion which is slightly more liberal than decisive EX-learning 
in so far as every abandoned hypothesis can be “reconjectured” at most once 
and that, furthermore, C can indeed be decisively EX-learned if it contains N. 

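Definition 11. A learner M is second-time decisive iff there are no five strings
$\sigma_1, \ldots, \sigma_5$ with $\sigma_1 \preceq \sigma_2 \preceq \sigma_3 \preceq \sigma_4 \preceq \sigma_5$ such that $W_{M(\sigma_1)}$ is equal to $W_{M(\sigma_3)}$
and $W_{M(\sigma_5)}$ but differs from $W_{M(\sigma_2)}$ and $W_{M(\sigma_4)}$.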

So a second-time decisive learner avoids W-shaped learning but in general may 
show U-shaped learning. Due to lack of space, we omit the proof of Remark 12.

Remark 12. A learner M is second-time decisive if and only if there is a set S
of strings such that M is decisive on S, as well as on the complement of S. 
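
Second-time decisiveness can also be tested on a finite run of a learner. The Python sketch below is illustrative: as an assumption, the conjectured languages along one text are represented by comparable stand-ins (here short strings naming languages), and `is_second_time_decisive` is our name. It reports a violation exactly when some conjecture is left and resumed twice, which is the W-shape excluded by the definition.

```python
def is_second_time_decisive(conjectures):
    """Return True iff no conjectured language is returned to twice."""
    n = len(conjectures)
    for start in range(n):
        base = conjectures[start]
        returns = 0      # completed "leave and come back" episodes for base
        away = False     # True while the current conjecture differs from base
        for j in range(start + 1, n):
            if conjectures[j] != base:
                away = True
            elif away:
                returns += 1
                away = False
                if returns == 2:
                    return False   # base, other, base, other, base
    return True


u_shape = ["A", "B", "A"]
w_shape = ["A", "B", "A", "C", "A"]
```

A single U-shape such as `u_shape` is permitted, while `w_shape` is rejected because the language named A is abandoned and reconjectured twice.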

In connection with Lemma 14 below we recall the concept of locking sequence.

Definition 13. Let a learner M, a set L and a string $\sigma$ be given. Then $\sigma$ is a
locking sequence for M and L iff $\mathrm{rng}(\sigma)$ is contained in $L = W_{M(\sigma)}$ and $M(\sigma\tau)$
is equal to $M(\sigma)$ for all strings $\tau$ over $W_{M(\sigma)}$. Furthermore, $\sigma$ is a locking
sequence for M iff $\sigma$ is a locking sequence for M and $W_{M(\sigma)}$.

A learner M learns via locking sequences iff every text for a language L which 
is EX-learned by M has a prefix which is a locking sequence for M and L.
The subsequent proofs in this section will use Lemma 14 below. Our proof of
this lemma is an adaptation of the proof of the well-known fact that every class
which can be EX-learned from text at all, can actually be EX-learned from text
via locking sequences (see for example Theorem 13 in P] and the references cited
there). To save space, we omit the proof of Lemma 14.

Lemma 14. Let g be a computable function such that every finite set is contained
in infinitely many of the pairwise distinct sets $W_{g(0)}, W_{g(1)}, \ldots$ and let
the class C be EX-learned from text by some learner $M_0$.

Then there is a learner M and a set S such that M EX-learns C from text
via locking sequences, M is decisive on S and on its complement and

$W_{M(\sigma)}$ is in $\begin{cases} \{W_{g(i)} : i \in N\} & \text{if } \sigma \text{ is in } S, \\ \{W_{M_0(\tau)} : \tau \text{ is a string}\} & \text{if } \sigma \text{ is not in } S. \end{cases}$ (5)



We have seen in Sect. 3 that there are classes which can be EX-learned from text,
but cannot be learned so decisively. Proposition 15 shows that the additional
power of non-decisive learning is already achieved if we allow the learner to 
“return” to each abandoned hypothesis at most once. 

Proposition 15. Every class which can be EX-learned from text can also be
EX-learned from text by a second-time decisive learner.

Proof. By Remark 12, Proposition 15 is a special case of Lemma 14 where we fix
a computable function g such that for all i, the set $W_{g(i)}$ is just $\{0, \ldots, i\}$. □

The class constructed in the proof of Theorem 6 in order to separate the concepts
of general and decisive EX-learning does not contain the set N. By Proposition 16,
which is again a direct consequence of Lemma 14, this is no coincidence.



Proposition 16. Every class which contains N and can be EX-learned from
text can be decisively EX-learned from text.

Proof. Let C be EX-learnable and contain N. Fulk [2] showed that C has a
prudent learner $M_0$; such a learner outputs only indices of sets which it also
learns. Since there is a locking sequence $\tau$ for N, $M_0$ does not identify any finite
language containing $\mathrm{rng}(\tau)$. Thus, defining $W_{g(i)} = \{0, 1, \ldots, i + \max \mathrm{rng}(\tau)\}$
makes the sets $W_{g(i)}$ different from all sets learned by $M_0$ and thus also different
from all sets conjectured by $M_0$.

We apply Lemma 14 to $M_0$ and g. We obtain an EX-learner M for C and
a set S of strings such that M is decisive on S and its complement. Moreover,
$W_{M(\sigma)}$ and $W_{M(\eta)}$ are different for all $\sigma \in S$ and $\eta \notin S$, whence M is already
decisive by Remark 3. □

Remark 17. Fulk, Jain and Osherson [3, Theorem 4] give an example of a
class L which can be BC-learned from text but cannot be learned so decisively.
While their construction bears some similarities to the construction of the class
C in the proof of Theorem 6, their class L is not EX-learnable from text and
consequently neither yields Theorem 6 nor a separation of decisive and general
EX-learning as stated in Corollary 7.

For a proof, note that there is no set in L which contains the numbers (0, 0)
and (1, 0), whence by switching to some fixed index for N as soon as the data
contains both numbers, every EX-learner for L can be transformed into an EX-learner
for L which also identifies N. But then by Proposition 16, if the class L
were EX-learnable, it would be decisively EX-learnable, whereas by construction L
is not even decisively BC-learnable.
Abstract. We present new polynomial-time approximation schemes (PTAS) for
several basic minimum-cost multi-connectivity problems in geometrical graphs.
We focus on low connectivity requirements. Each of our schemes either significantly
improves the previously known upper time-bound or is the first PTAS for
the considered problem.

We provide a randomized approximation scheme for finding a biconnected graph
spanning a set of points in a multi-dimensional Euclidean space and having the
expected total cost within $(1+\varepsilon)$ of the optimum. For any constant dimension and
$\varepsilon$, our scheme runs in time $O(n \log n)$. It can be turned into a Las Vegas one without
affecting its asymptotic time complexity, and also efficiently derandomized.
The only previously known truly polynomial-time approximation (randomized) 
scheme for this problem runs in expected time n ■ ^ in the 

simplest planar case. The efficiency of our scheme relies on transformations of
nearly optimal low cost special spanners into sub-multigraphs having good decomposition
and approximation properties and a simple subgraph connectivity
characterization. By using merely the spanner transformations, we obtain a very
fast polynomial-time approximation scheme for finding a minimum-cost k-edge
connected multigraph spanning a set of points in a multi-dimensional Euclidean
space. For any constant dimension, $\varepsilon$, and k, this PTAS runs in time $O(n \log n)$.
Furthermore, by showing a low-cost transformation of a k-edge connected graph
maintaining the k-edge connectivity and developing novel decomposition properties,
we derive a PTAS for Euclidean minimum-cost k-edge connectivity. It is
substantially faster than that previously known.

Finally, by extending our techniques, we obtain the first PTAS for the problem
of Euclidean minimum-cost Steiner biconnectivity. This scheme runs in time
$O(n \log n)$ for any constant dimension and $\varepsilon$. As a byproduct, we get the first
known non-trivial upper bound on the number of Steiner points in an optimal
solution to an instance of Euclidean minimum-cost Steiner biconnectivity.
1 Introduction

Multi-connectivity graph problems are central in algorithmic graph theory and have
numerous applications in computer science and operations research [2,10,22]. They are
also very important in the design of networks that arise in practical situations [2,10].
Typical application areas include telecommunication, computer and road networks. Low 
degree connectivity problems for geometrical graphs in the plane can often closely 
approximate such practical connectivity problems (see, e.g., the discussion in [10,22]). 

In this paper, we provide a thorough theoretical study of these problems in Euclidean 
space (i.e., for geometrical graphs). We consider several basic connectivity problems 
of the following form: for a given set S of n points in the Euclidean space find a 
minimum-cost subgraph of a complete graph on S that satisfies a priori given connectivity
requirements. The cost of such a subgraph is equal to the sum of the Euclidean distances 
between adjacent vertices. 
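
The cost measure just defined, and the quality guarantee that an approximation scheme has to meet, can be written down directly. The Python sketch below is only an illustration: points are coordinate tuples, edges are pairs of point indices, and the function names and the tolerance parameter `eps` are our assumptions.

```python
import math


def graph_cost(points, edges):
    """Sum of Euclidean distances between the endpoints of all edges."""
    return sum(math.dist(points[u], points[v]) for u, v in edges)


def within_factor(cost, optimum, eps):
    """True iff `cost` is at most (1 + eps) times the optimum cost."""
    return cost <= (1 + eps) * optimum


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # a biconnected spanning graph
cycle_cost = graph_cost(square, cycle)
```

For the four corners of the unit square, the boundary cycle is a biconnected spanning subgraph of cost 4; parallel edges of a multigraph would simply be listed more than once in `edges`.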

The most classical problem we investigate is the (Euclidean) minimum-cost k-vertex 
connected spanning subgraph problem. We are given a set S of n points in the Euclidean 
space $\mathbb{R}^d$ and the aim is to find a minimum-cost k-vertex connected graph spanning points
in S (i.e., a subgraph of the complete graph on S). By substituting the requirement
of k-edge connectivity for that of k-vertex connectivity, we obtain the corresponding
(Euclidean) minimum-cost k-edge connected spanning subgraph problem. We term the 
generalization of the latter problem which allows for parallel edges in the output graph 
spanning S as the (Euclidean) minimum-cost k-edge connected spanning sub-multigraph 
problem. 
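
To illustrate the difference between the graph and the multigraph variants, the following Python sketch checks k-edge connectivity by brute force: it removes every set of fewer than k edges and tests whether the remainder is still connected. It is an assumption-laden illustration only (the names `is_connected` and `is_k_edge_connected` are hypothetical, and the running time is exponential in k); it is not the method used in this paper.

```python
from itertools import combinations

def is_connected(n, edges):
    """Depth-first search connectivity test on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def is_k_edge_connected(n, edges, k):
    """True if removing any set of fewer than k edges leaves the
    (multi)graph connected. Parallel edges are repeated list entries."""
    for j in range(k):
        for removed in combinations(range(len(edges)), j):
            dropped = set(removed)
            kept = [e for i, e in enumerate(edges) if i not in dropped]
            if not is_connected(n, kept):
                return False
    return True

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]          # simple graph on 4 vertices
doubled_path = [(0, 1), (0, 1), (1, 2), (1, 2)]   # multigraph on 3 vertices
```

The doubled path is two-edge connected only because parallel edges are allowed; as a simple graph the same path would be disconnected by removing a single edge.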

The concept of minimum-cost k-connectivity naturally extends to include that of Euclidean
Steiner k-connectivity by allowing the use of additional vertices, called Steiner
points. The problem of (Euclidean) minimum-cost Steiner k-vertex- (or, k-edge-) connectivity
is to find a minimum-cost graph on a superset of the input point set S in $\mathbb{R}^d$
which is k-vertex- (or, k-edge-) connected with respect to S. For k = 1, it is simply the
famous Steiner minimal tree (SMT) problem, which has been very extensively studied
in the literature (see, e.g., [11,16]).

Since all the aforementioned problems are known to be NP-hard when restricted to
even two dimensions for k ≥ 2 [8,18], we focus on efficient constructions of good approximations.
We aim at developing a polynomial-time approximation scheme, a PTAS.
This is a family of algorithms $\{A_\varepsilon\}$ such that, for each fixed ε > 0, $A_\varepsilon$ runs in time
polynomial in the size of the input and produces a (1 + ε)-approximation [15].

Previous work. Despite the practical relevance of the multi-connectivity problems for 
geometrical graphs and the vast amount of practical heuristic results reported (see, e.g., 
[9,10,22,23]) very little theoretical research has been done towards developing efficient 
approximation algorithms for these problems. This contrasts with the very rich and 
successful theoretical investigations of the corresponding problems in general metric 
spaces and for general weighted graphs (see, e.g., [10,12,15,17]). Even for the simplest 
(and most fundamental) problem considered in our paper, that of finding a minimum-cost 
biconnected graph spanning a given set of points in the Euclidean plane, for a long time 
obtaining approximations achieving better than a 3/2 ratio had been elusive and only very
recently has a PTAS been developed [6]. For any fixed ε > 0, this algorithm outputs
a (1 + ε)-approximation in expected time $n (\log n)^{[\ldots]}$. The approximation
scheme developed in [6] can be extended to arbitrary k and d, but in the general case the
dependence on k and particularly on d makes the algorithm impractical. For an ε > 0, the
algorithm runs in expected time $n \cdot 2^{[\ldots]}$. Note that d
plays a large role in the running time of these schemes. In fact, the result from [6] implies
that for every $d = \Omega(\log n)$, even for k = 2 no PTAS exists unless P = NP. Thus,
the problem of finding a minimum-cost biconnected spanning subgraph does not have
a PTAS (unless P = NP) even in the metric case. Hence, our restriction to Euclidean
graphs in low dimensions plays an essential role in these schemes.

A related, but significantly weaker result has also been presented in [5]. Here an
optimal solution to the problem without allowing Steiner points is approximated to an
arbitrarily close degree via the inclusion of Steiner points.

When Steiner points are allowed in the minimum-cost Steiner k-vertex- (or k-edge-)
connectivity problem, the only non-trivial results are known for k = 1, i.e., for the
minimum Steiner tree problem (SMT). In the breakthrough paper [3], Arora designed a
PTAS for SMT for all constants d. Mitchell independently obtained a similar result for
d = 2 [19]. Soon after, Rao and Smith [20] offered a significantly faster PTAS for SMT
running in time $O(n \log n)$ for a constant d. For k ≥ 2, the only result we are aware of
is a $\sqrt{3}$-approximation in polynomial time for k = 2 [14].



New results. In this paper we present new polynomial-time approximation schemes for 
several of the aforementioned connectivity problems in geometric graphs. We focus on 
low connectivity requirements. Each of our approximation schemes either significantly 
improves the previously known upper time-bound or is the first PTAS for the considered 
problem. 

Our main new result is a fast polynomial-time (randomized) approximation scheme 
for finding a biconnected graph spanning a set of points in a d-dimensional Euclidean
space and having expected cost within (1 + ε) of optimum. For any constant d and ε,
our algorithm runs in expected time $O(n \log n)$. Our scheme is a PTAS for the problem
in $\mathbb{R}^d$ for all d such that $2^{[\ldots]} = \mathrm{poly}(n)$, for some absolute constant c. We can turn
our randomized scheme into a Las Vegas one without affecting its asymptotic time
complexity. With a very slight increase of the running time (a constant factor provided
that d and ε are constant) we can also obtain a deterministic (1 + ε)-approximation. Our
scheme is significantly, i.e., by a factor of at least $[\ldots]$, faster than that
from [6].

Since a minimum-cost biconnected graph spanning a set of points in a metric space 
is also a minimum-cost two-edge connected graph spanning this set, our PTAS yields 
also the corresponding PTAS for the Euclidean minimum-cost two-edge connectivity. 

We extend the techniques developed for the biconnectivity algorithms and present a 
fast randomized PTAS for finding a minimum-cost k-edge connected multigraph spanning
a set of points in a d-dimensional Euclidean space. The running time of our Las
Vegas scheme is $O(n \log n) + n\,2^{[\ldots]}$ for any constant d and ε.

We are also able to improve upon the k-edge connectivity results from [6] significantly.
By showing a low-cost transformation of a k-edge connected graph maintaining
the k-edge connectivity and developing novel decomposition properties, we derive a
PTAS for Euclidean minimum-cost k-edge connectivity which for any constant d, ε,
and k, runs in expected time $n (\log n)^{O(1)}$. The corresponding scheme in [6] requires
$n (\log n)^{[\ldots]}$ time when d = 2 and for any constant ε, k.

Furthermore, we present a series of new structural results about minimum-cost bicon- 
nected Euclidean Steiner graphs, e.g., a decomposition of a minimum-cost biconnected 
Steiner graph into minimal Steiner trees. We use these results to derive the first PTAS for 
the minimum-cost Steiner biconnectivity and Steiner two-edge connectivity problems. 
For any constant d and ε, our scheme runs in expected time $O(n \log n)$. As a byproduct
of the aforementioned decomposition, we also obtain the first known non-trivial upper
bound on the minimum number of Steiner points in an optimal solution to an n-point
instance of Euclidean minimum-cost Steiner biconnectivity, which is 3n - 2.



Techniques. The only two known PTAS approaches to Euclidean minimum-cost k-vertex-
(or, k-edge-) connectivity (see [5,6]) are based on decompositions of k-connected
Euclidean graphs combined with the general framework proposed recently by Arora [3]
for designing PTAS for Euclidean versions of TSP, Minimum Steiner Tree, Min-Cost
Perfect Matching, k-TSP, etc. (For another related framework for geometric PTAS see
[19].) In contrast to all previous applications of Arora’s framework using Steiner points
in the so-called patching procedures [3,5], a patching method free of Steiner points is
given in [6]. (Steiner points of degree at least three are difficult to remove for k-connectivity
when k ≥ 2. This should be compared to the problems considered by Arora [3] and Rao
and Smith [20] where the output graphs have very simple connectivity structure.) This 
disallowance in [6] makes it hard to prove strong global structural properties of close 
approximations with respect to a given geometric partition. 

Structural theorems in Arora’s framework typically assert the existence of a recur- 
sive partition of a box containing the n input points (perturbed to nearest grid points) 
into cubes such that the optimal solution can be closely approximated by a so-called
(m, r)-light solution in which, for every cube, there are very few edges crossing its
boundaries. The structural theorem in [6] yields only weaker structural properties of
approximate solutions ((m, r)-grayness and (m, r)-blueness) which bound solely the
number of crossings between the cube boundaries and the edges having exactly one endpoint
within the cube. That bound is constant for “short edges” (i.e., edges having length
within a constant factor of the side-length of the cube) and it is $O(\log\log n)$ for “long
edges” (assuming that k, d, and ε are constant). Furthermore, most of the crossings are
located in one of $2^d$ prespecified points. The weaker structural properties (especially the
fact that there might be as many as $\Theta(\log\log n)$ edges having exactly one endpoint in a
cube in the partition) lead to the high time complexity of the main dynamic programming
procedure in the PTAS presented in [6].

We take a novel approach in order to guarantee stronger structural properties of 
approximate solutions disallowing Steiner points. Our approach is partly inspired by the 
recent use of spanners to speed-up PTAS for Euclidean versions of TSP by Rao and 
Smith [20]. In effect, for k = 2, we are able to prove a substantially stronger structural
property (r-local-lightness, see Theorem 3.1) than that in [6]. It yields a constant upper
bound on the number of long edges with exactly one endpoint in a cube in the partition
provided that d and ε are constant.
Our proof relies on a series of transformations of a (1 + ε)-spanner for the input point
set, having low cost and the so-called isolation property [1], into an r-locally-light k-edge
connected multigraph spanning the input set and having nearly optimal cost. Without
any increase in the cost, in case k = 2, the aforementioned multigraph is efficiently
transformed into a biconnected graph spanning the input point set. Furthermore, for
the purpose of dynamic programming, we succeed in using a more efficient subgraph
connectivity characterization in case k = 2 than that used in [5,6].

By using merely the aforementioned spanner transformations, we also obtain the fast 
randomized PTAS for finding a minimum-cost k-edge connected multigraph spanning
a set of points in a multi-dimensional Euclidean space.

It seems unlikely that a cost-efficient transformation of a k-edge connected multigraph
into a k-edge connected graph on the same point set exists. For this reason, in case
of k-edge connectivity, we consider an arbitrary k-edge connected graph on the input
point set instead of a spanner, and derive a series of cost-efficient transformations of
the former into an r-locally-light k-edge connected graph on the input set. The transformations
yield the fastest known randomized PTAS for Euclidean minimum-cost k-edge
connectivity.

Our investigations of spanners with the isolation property, the explicit use of multi- 
graphs instead of graphs, and the proof that nearly optimal, low cost spanners possessing 
the isolation property induce r-locally-light sub-multigraphs having good approxima- 
tion properties are the main sources of the efficiency of the approximation schemes for 
Euclidean minimum-cost connectivity problems (without Steiner points) presented in 
this paper. 

By extending the aforementioned techniques to include Steiner points, deriving the 
decomposition of a minimum-cost biconnected Steiner graph into minimal Steiner trees, 
and using the generalization of (1 + ε)-spanner to include Steiner points called (1 + ε)-banyans
in [20,21], we obtain the first PTAS for minimum-cost Steiner biconnectivity
and Steiner two-edge connectivity. 

Organization of the paper. Section 2 provides basic terminology used in our approxi- 
mation schemes. In Section 3 we outline our new PTAS for Euclidean minimum-cost 
biconnectivity. Section 4 sketches the PTAS for Euclidean minimum-cost k-edge connectivity
in multigraphs. In Section 5 we derive our PTAS for Euclidean minimum-cost
k-edge connectivity in graphs. Section 6 presents the PTAS for minimum-cost Steiner
biconnectivity and Steiner two-edge connectivity. Due to space limitations most of our 
technical claims and their proofs are postponed to the full version of the paper. 



2 Definitions 

We consider geometrical graphs. A geometrical (multi-)graph on a set of points S in
$\mathbb{R}^d$ is a weighted (multi-)graph whose set of vertices is exactly S and for which the
cost of an edge is the Euclidean distance between its endpoints. The (total) cost of the
(multi-)graph is the sum of the costs of its edges. A (multi-)graph G on S spans S if it
is connected (i.e., there is a path in G connecting any two points in S). As in [3,5,6,20],
we allow edges to deviate from straight-line segments and specify them as straight-line
paths (i.e., paths consisting of straight-line segments) connecting the endpoints. This
relaxation enables the edges to pass through some prespecified points (called portals)
where they may be “bent.” When all edges are straight-line segments, G is called a
straight-line graph. For a multigraph G, the graph induced by G is the graph obtained 
by reducing the multiplicity of each edge of G to one. 

We shall denote the cost of the minimum spanning tree on a point set X by mst(X).
A t-spanner of a set of points S in $\mathbb{R}^d$ is a subgraph of the complete straight-line graph
on S such that for any two points x, y ∈ S the length of the shortest path from x to y in
the spanner is at most t times the Euclidean distance between x and y [1].
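
The t-spanner condition can be checked directly on small inputs. The sketch below is an illustration under stated assumptions, not the spanner construction of [1]: it computes all-pairs shortest paths with the Floyd-Warshall recurrence and reports the largest ratio of graph distance to Euclidean distance. The names `stretch_factor` and `is_t_spanner` are hypothetical, and the points are assumed to be pairwise distinct.

```python
import math

def stretch_factor(points, edges):
    """Largest ratio (shortest-path length) / (Euclidean distance) over
    all pairs of distinct points; math.inf if the graph is disconnected."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v in edges:
        w = math.dist(points[u], points[v])
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Floyd-Warshall all-pairs shortest paths, O(n^3) time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return max(dist[i][j] / math.dist(points[i], points[j])
               for i in range(n) for j in range(i + 1, n))

def is_t_spanner(points, edges, t):
    return stretch_factor(points, edges) <= t

# Hypothetical sample input: corners of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
three_sides = [(0, 1), (1, 2), (2, 3)]
four_sides = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

With only three sides of the square, the two corners joined by the missing side are at graph distance 3 and Euclidean distance 1; adding the fourth side leaves the diagonal pairs as the worst case.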




Fig. 1. Dissection of a bounding cube in $\mathbb{R}^2$ (left) and the corresponding $2^2$-ary tree (right). In
the tree, the children of each node are ordered from left to right: Top/Left square, Bottom/Left
square, Bottom/Right square, and Top/Right square.



We hierarchically partition the space as in [3]. A bounding box of a set S of points
in $\mathbb{R}^d$ is a smallest d-dimensional axis-parallel cube containing the points in S. A ($2^d$-ary)
dissection [3] (see Figure 1) of a set of points in a cube $L^d$ in $\mathbb{R}^d$ is the recursive
partitioning of the cube into smaller sub-cubes, called regions. Each region $U^d$ of volume
> 1 is recursively partitioned into $2^d$ regions $(U/2)^d$. A $2^d$-ary tree (for a given $2^d$-ary
dissection) is a tree whose root corresponds to $L^d$, and whose other non-leaf nodes
correspond to the regions containing at least two points from the input set (see Figure 1).
For a non-leaf node v of the tree, the nodes corresponding to the $2^d$ regions partitioning
the region corresponding to v, are the children of v in the tree.

For any d-vector $a = (a_1, \ldots, a_d)$, where all $a_i$ are integers $0 \le a_i \le L$, the
a-shifted dissection [3,6] of a set X of points in the cube $L^d$ in $\mathbb{R}^d$ is the dissection of
the set X* in the cube $(2L)^d$ in $\mathbb{R}^d$ obtained from X by transforming each point x ∈ X
to x + a. A random shifted dissection of a set of points X in a cube $L^d$ in $\mathbb{R}^d$ is an
a-shifted dissection of X with $a = (a_1, \ldots, a_d)$ and the elements $a_1, \ldots, a_d$ chosen
independently and uniformly at random from $\{0, 1, \ldots, L\}$.
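
A random shifted dissection and its tree can be sketched as follows. This is a simplified illustration, not the data structure used in the paper: it assumes integer point coordinates in {0, ..., L-1}, L a power of two, half-open regions, and it stores only non-empty regions. The names `shifted_dissection`, `random_shift`, and `leaves` are hypothetical.

```python
import random
from itertools import product

def shifted_dissection(points, L, shift):
    """Tree of the shift-translated dissection of the cube [0, 2L)^d.
    A region is split into 2^d sub-cubes while it holds at least two
    points and its side exceeds 1; empty sub-cubes are not stored."""
    d = len(shift)
    shifted = [tuple(x + a for x, a in zip(p, shift)) for p in points]

    def build(corner, side, pts):
        node = {"corner": corner, "side": side, "points": pts, "children": []}
        if len(pts) >= 2 and side > 1:
            half = side // 2
            for offset in product((0, half), repeat=d):
                c = tuple(ci + o for ci, o in zip(corner, offset))
                sub = [p for p in pts
                       if all(cj <= pj < cj + half for cj, pj in zip(c, p))]
                if sub:
                    node["children"].append(build(c, half, sub))
        return node

    return build((0,) * d, 2 * L, shifted)

def random_shift(L, d, rng):
    """Each coordinate drawn independently and uniformly from {0, ..., L}."""
    return tuple(rng.randint(0, L) for _ in range(d))

def leaves(node):
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]

# Hypothetical sample input: three grid points with L = 4 in the plane.
points = [(0, 0), (3, 3), (3, 2)]
tree = shifted_dissection(points, L=4,
                          shift=random_shift(4, 2, random.Random(0)))
```

Because distinct integer points always fall into different unit cells, every leaf of `tree` holds exactly one point, whatever shift is drawn.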

A crossing of an edge with a region facet of side-length W in a dissection is called
relevant if the edge has exactly one endpoint in the region and its length is at most $2\sqrt{d}\,W$.
A graph is r-gray with respect to a shifted dissection if each facet of each region in the
dissection has at most r relevant crossings. A graph is r-locally-light with respect to a
shifted dissection if for each region in the dissection there are at most r edges having 
exactly one endpoint in the region. A graph is r-light [3,20,6] with respect to a shifted 
dissection if for each region in the dissection there are at most r edges crossing any of 
its facets. (It is important to understand the difference between these two latter notions.) 
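
The difference between the two notions can be seen on a tiny straight-line example. The sketch below is illustrative only and makes simplifying assumptions: regions are half-open axis-parallel cubes, edges are straight segments, and an edge is counted as crossing the region when the segment meets the closed cube without having both endpoints inside it. All function names are hypothetical.

```python
def inside(p, corner, side):
    """Point-in-region test for the half-open cube corner + [0, side)^d."""
    return all(c <= x < c + side for x, c in zip(p, corner))

def one_endpoint_edges(points, edges, corner, side):
    """Quantity bounded by r-local-lightness: edges with exactly one
    endpoint in the region."""
    return sum(inside(points[u], corner, side) != inside(points[v], corner, side)
               for u, v in edges)

def segment_meets_box(p, q, corner, side):
    """Slab clipping: does segment pq intersect the closed cube?"""
    t0, t1 = 0.0, 1.0
    for a, b, c in zip(p, q, corner):
        delta = b - a
        lo, hi = c, c + side
        if delta == 0:
            if a < lo or a > hi:
                return False
        else:
            ta, tb = (lo - a) / delta, (hi - a) / delta
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True

def crossing_edges(points, edges, corner, side):
    """Quantity bounded by r-lightness in this simplified reading: edges
    that meet the region but do not lie with both endpoints inside it."""
    count = 0
    for u, v in edges:
        p, q = points[u], points[v]
        both_inside = inside(p, corner, side) and inside(q, corner, side)
        if not both_inside and segment_meets_box(p, q, corner, side):
            count += 1
    return count

# Hypothetical sample input: region [0, 2)^2 and three edges.
pts = [(1.0, 1.0), (5.0, 1.0), (-3.0, 1.0), (5.0, 5.0)]
edges = [(0, 1), (2, 1), (1, 3)]
```

For the region with corner (0, 0) and side 2, the edge (2, 1) passes straight through the region with both endpoints outside, so it counts toward lightness but not toward local lightness.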

An m-regular set of portals in a (d - 1)-dimensional region facet $N^{d-1}$ is an
orthogonal lattice of m points in the facet where the spacing between the portals is
$(N + 1) \cdot m^{-1/(d-1)}$ (cf. [3]). If a graph is r-locally-light and for each facet in any
region every edge crosses through one of the m portals in the facet then the graph is
called (m, r)-locally-light.

3 Algorithm for Euclidean Biconnectivity 

In this section we sketch a randomized algorithm that finds a biconnected graph spanning
a set S of n points in $\mathbb{R}^d$ whose cost is at most (1 + ε) times the minimum. We specify
here only a key lemma and the structural theorem and defer the detailed description of 
the algorithm and its analysis to the full version of the paper. 

Our algorithm starts by finding a smallest bounding box for the input point set S,
rescaling the input coordinates so the bounding box is of the form $[0, O(n \sqrt{d}\,(1 + 1/\varepsilon))]^d$,
and moving the points in S to the closest unit grid points. We shall term the perturbed
point set as well-rounded (see also [3,6]). Next, the algorithm finds an appropriate
(1 + O(ε))-spanner of the well-rounded point set which has the so-called (κ, c)-isolation
property for appropriate parameters κ and c [1].
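
The rescaling and rounding step can be sketched as below. This is a minimal illustration, assuming the hidden constant in the $O(\cdot)$ bound is 1 and that ties in rounding may be broken arbitrarily; the function name `well_round` is hypothetical and the spanner construction is not shown.

```python
import math

def well_round(points, eps):
    """Translate and rescale the points so that their bounding cube has
    side L = ceil(n * sqrt(d) * (1 + 1/eps)), then snap every point to
    the nearest integer grid point. Returns (rounded_points, L).

    Assumption: the constant hidden in the O(.) bound is taken to be 1.
    """
    n, d = len(points), len(points[0])
    mins = [min(p[i] for p in points) for i in range(d)]
    extent = max(max(p[i] for p in points) - mins[i] for i in range(d))
    L = math.ceil(n * math.sqrt(d) * (1 + 1 / eps))
    scale = L / extent if extent > 0 else 1.0
    rounded = [tuple(round((p[i] - mins[i]) * scale) for i in range(d))
               for p in points]
    return rounded, L

# Hypothetical sample input: two points in the plane, eps = 1.
rounded, L = well_round([(0.0, 0.0), (1.0, 0.5)], eps=1.0)
```

After rescaling, rounding moves each coordinate by at most 1/2, which is small compared with the side-length L of the bounding cube.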

Definition 3.1. Let c, 0 < c < 1, be a constant and let κ ≥ 1. A geometrical graph G
on a set of points in $\mathbb{R}^d$ satisfies the (κ, c)-isolation property, if for each edge e of G of
length ℓ, there is a cylinder C of length and radius c·ℓ and with its axis included in e such
that C intersects at most κ edges of G.

In the next step, the algorithm chooses a random shifted dissection and builds the 
corresponding shifted $2^d$-ary tree for the perturbed S.

The following key lemma yields a low upper bound on the number of crossings of 
the boundaries of a region in the dissection by long edges of the spanner having exactly 
one endpoint within the region. 

Lemma 3.1. Let G be a geometrical graph spanning a set S of points in $\mathbb{R}^d$ and satisfying
the (κ, c)-isolation property, where 0 < c < 1 is a constant. There exists a constant
c' such that for any region in a shifted dissection of S the number of edges of G of length
at least $c' \sqrt{d}$ times the side-length of the region that have precisely one endpoint within
the region is $\kappa \cdot [\ldots]$.

Combining this lemma with several lemmas that describe graph transformations
reducing the number of crossings of the boundaries of a region in the dissection by short 
edges, we obtain the following structural theorem. 

Theorem 3.1. Let ε, λ be any positive reals and let k be any positive integer. Next, let
$\mathcal{G}$ be a (1 + ε)-spanner for a well-rounded set S of points in $\mathbb{R}^d$ that satisfies the
(κ, c)-isolation property with constant c, 0 < c < 1, $\kappa = [\ldots]$, and has $n \cdot (d/\varepsilon)^{[\ldots]}$
edges whose total cost equals $L_{\mathcal{G}}$. Choose a shifted dissection uniformly at random.
Then one can modify $\mathcal{G}$ to a graph G spanning S such that

- G is r-locally-light with respect to the shifted dissection chosen, where
  $r = 2^{d+1} \cdot d \cdot [\ldots] + d \cdot (O(\lambda \cdot [\ldots]))^{[\ldots]} + (d/\varepsilon)^{[\ldots]}$, and

- there exists a k-edge connected multigraph H which is a spanning subgraph of G
  with possible parallel edges (of multiplicity at most k), whose expected (over the
  choice of the shifted dissection) cost is at most $(1 + \varepsilon + [\ldots])$ times the cost of the
  minimum-cost k-edge connected multigraph spanning S.

Furthermore, this modification can be performed in time
$O(d \cdot L \cdot \log L) + n \cdot [\ldots] + n \cdot (d/\varepsilon)^{[\ldots]}$,
where L is the side-length of the smallest bounding box containing S.

Further, our algorithm modifies the spanner according to Theorem 3.1 producing 
an r-locally-light graph G where r is constant for constant ε and d. In the subsequent
step, the algorithm runs a dynamic programming subroutine for finding a minimum-cost
two-edge connected multigraph for which the induced graph is a subgraph of G (note
that by Theorem 3.1 the multigraph has expected cost very close to that of the minimum-cost
k-edge connected multigraph spanning the perturbed S). The efficiency of the
subroutine relies on a new, forest-like characterization of the so-called connectivity type
of a multigraph within a region of the dissection. It is substantially more concise than 
the corresponding one used in [5,6]. Next, the algorithm transforms the multigraph to 
a biconnected (straight-line) graph without any increase in cost. Finally, it modifies the 
biconnected graph to a biconnected graph on the input set by re-perturbing its vertices. 

Theorem 3.2. The algorithm finds a biconnected graph spanning the input set of n
points in $\mathbb{R}^d$ and having expected cost within (1 + ε) from the optimum. The running
time of the algorithm is $O(n \cdot [\ldots] \cdot \log(nd/\varepsilon)) + n \cdot [\ldots]$. In particular,
when d and ε are constant, then the running time is $O(n \log n)$. For a constant d and
arbitrary $s = 1/\varepsilon > 1$ the running time is $O(n s \log(n s) + n\,[\ldots])$. The algorithm can
be turned into a Las Vegas one without affecting the stated asymptotic time bounds.

Although we have used many ideas from [6] in the design of our algorithm, we have 
chosen the method of picking a random shifted dissection given in [3,20,21]. Therefore 
we can apply almost the same arguments as those used by Rao and Smith [21, Sections 2.2
and 2.3] to derandomize our algorithm at a small increase of the cost.

Theorem 3.3. For every positive ε there exists a deterministic algorithm running in
time $n (d/\varepsilon)^{[\ldots]} \log n + [\ldots]$ that for every set of n points in $\mathbb{R}^d$ produces a
biconnected graph spanning the points and having the cost within (1 + ε) of the minimum.
In particular, when d and ε are constant, the running time is $O(n \log n)$. For a constant
d and arbitrary $s = 1/\varepsilon > 1$ the running time is $O(n \log n + n\,2^{[\ldots]})$.

4 Euclidean k-Edge Connectivity in Multigraphs

We can extend the techniques developed in the previous sections to the problem of finding
a low-cost k-edge connected multigraph spanning a set of points in $\mathbb{R}^d$ for k ≥ 2. To begin
with, we follow the PTAS from Section 3 up to and including the spanner-modification step,
only changing some parameters. The resulting graph G is an r-locally-light graph for
$r = k^{[\ldots]} \cdot [\ldots]$. By Theorem 3.1, there exists a k-edge connected multigraph H such that
the induced graph is a subgraph of G and the expected cost of H is at most $(1 + [\ldots])$ times
larger than the minimum-cost k-edge connected multigraph spanning S. As was the case
for the PTAS from Section 3, we can apply dynamic programming to find a minimum-cost
k-edge connected multigraph H* for which the induced graph is a subgraph of G.
This time we use a more general (but less efficient) connectivity characterization from
[5] yielding $2^{[\ldots]}$ different connectivity types. In effect, the dynamic programming is
more expensive, and the total running time is $n \cdot [\ldots]$. Now, it is sufficient to re-perturb
the vertices of the multigraph H* (correspondingly to the last step of the PTAS from
Section 3) in order to obtain the following theorem.

Theorem 4.1. Let k be an arbitrary positive integer. There exists an algorithm that
finds a k-edge connected multigraph spanning the input set of n points in $\mathbb{R}^d$ and having
expected cost within (1 + ε) from the optimum. The running time of the algorithm is
$O(n \cdot [\ldots] \cdot \log(nd/\varepsilon)) + n \cdot [\ldots]$. In particular, when d and ε
are constant, then the running time is $O(n \log n) + [\ldots]$. For a constant d and an
arbitrary $s = 1/\varepsilon > 1$ the running time is $O(n s \log(n s)) + [\ldots]$. The algorithm
can be turned into a Las Vegas one without affecting the stated asymptotic time bounds.

Recall the use of Steiner points in the first attempt of deriving PTAS for the Euclidean
minimum-cost k-connectivity in [5] by allowing them solely on the approximation
side. As a byproduct, we can substantially subsume the results on minimum-cost
k-connectivity from [5] in the complexity aspect by using Theorem 4.1.

5 Euclidean k-Edge Connectivity in Graphs

Our spanner approach to biconnectivity relies on an efficient transformation of a two- 
edge connected multigraph into a biconnected graph on the same point set without 
any cost increase. Unfortunately, it seems that for k > 2 there is no similar cost-efficient
transformation between k-edge connected multigraphs and k-vertex- or k-edge
connected graphs. We show in this section that an arbitrary (in particular, minimum-cost)
k-edge connected graph spanning a well-rounded point set admits a series of
transformations resulting in an r-locally-light k-edge connected graph on this set with a
small increase in cost. By some further small cost increase, we can make the latter graph 
(m, r)-locally-light in order to facilitate efficient dynamic programming. 

The following lemma plays a crucial role in our analysis. Intuitively, it aims at 
providing a similar transformation of the input graph as that presented in Lemma 3.1. 
The main difficulty here is in the fact that the input graph may be arbitrary (while in 
Lemma 3.1 we have analyzed graphs having the isolation property). 

Lemma 5.1. Let G be a spanning graph of a well-rounded point set S in $\mathbb{R}^d$. For any
shifted dissection of S, G can be transformed into a graph G' satisfying the following
three conditions:

- if G' is different from G then the total cost of G' is smaller than that of G,
- for any region of size W in the dissection there are positive reals $x_i$, $i = 1, \ldots, [\ldots]$,
  greater than $4\sqrt{d}\,W$ such that the number of edges having their lengths outside
  any of the intervals $[x_i, 2x_i)$, and each having precisely one endpoint within the
  region is $[\ldots]$,
- if G is k-edge connected then so is G'.

Lemma 5.1 guarantees that in the graph resulting from the transformation provided
in the lemma no region in the dissection is crossed by too many long edges having their
length outside a finite number of intervals of the form $[x, 2x)$, $x > 4\sqrt{d}\,W$, and one point
inside the region. The following lemma reduces the number of the remaining edges.

Lemma 5.2. Let G be an r-gray spanning graph of a well-rounded set S of points in
$\mathbb{R}^d$. Let Q be a region of size W in the dissection of S, and let $x_i$, $i = 1, \ldots, [\ldots]$,
be positive reals greater than $4\sqrt{d}\,W$. If there are ℓ edges having their lengths outside
the intervals $[0, 2\sqrt{d}\,W]$, $[x_i, 2x_i)$, $i = 1, \ldots, [\ldots]$, and such that each has precisely
one endpoint in Q, then there are at most ℓ + r edges crossing the facets of Q and
having one endpoint in Q.

By these two lemmas, we obtain our structure theorem for k-edge connectivity. 

Theorem 5.1. Let ε > 0, and let S be a well-rounded set of n points in $\mathbb{R}^d$. A minimum-cost
k-edge connected graph spanning S can be transformed to a k-edge connected
graph H spanning S such that

- H is (m, r)-locally-light with respect to the shifted dissection, where
  $m = (O([\ldots] \cdot \sqrt{d} \cdot \log n))^{d-1}$ and $r = (O(k^{[\ldots]} \cdot [\ldots] / \varepsilon))^{[\ldots]}$, and

- the expected (over the choice of the shifted dissection) cost of H is at most (1 + ε)
  times larger than that of the minimum-cost graph.

The concept of (m, r)-local-lightness is a very simple case of that of (m, r)-blueness
used in [6]. Therefore, we can use a simplified version of the dynamic programming
method of [6] involving the k-connectivity characterization from [5] in order to find an
(m, r)-locally-light k-edge connected graph on a well-rounded point set satisfying the
requirements of Theorem 5.1. This yields a substantially faster PTAS for the Euclidean 
minimum-cost k-edge connectivity than that presented in [6]. 

Theorem 5.2. Let k be an arbitrary positive integer and let ε > 0. There exists a randomized
algorithm that finds a k-edge connected graph spanning the input set of n points
in $\mathbb{R}^d$ and having expected cost within (1 + ε) from the optimum. The expected running
time of the algorithm is $n \cdot [\ldots]$. In particular,
when k, d, and ε are constant, then the running time is $n (\log n)^{O(1)}$.

6 Euclidean Steiner Biconnectivity 

In this section we provide the first PTAS for Euclidean minimum-cost Steiner biconnectivity
and Euclidean minimum-cost two-edge connectivity. For any constant dimension
and ε, our scheme runs in time $O(n \log n)$. Our proof relies on a decomposition of a
minimum-cost biconnected Steiner graph into minimal Steiner trees and the use of the
so-called (1 + ε)-banyans introduced by Rao and Smith [20,21]. As a byproduct of
the decomposition, we derive the first known non-trivial upper bound on the minimum
number of Steiner points in an optimal solution to an n-point instance of Euclidean
minimum-cost Steiner biconnectivity which is 3n - 2.

Since for any set of points X in $\mathbb{R}^d$ the minimum-cost of a biconnected graph
spanning X is the same as the minimum-cost of a two-edge connected graph spanning 
X, in the remaining part of this section we shall focus only on the Euclidean minimum- 
cost Steiner biconnectivity problem. By a series of technical lemmas, we obtain the 
following characterization of any optimal graph solution to the Euclidean minimum- 
cost Steiner biconnectivity problem. 

Theorem 6.1. Let G be a minimum-cost Euclidean Steiner biconnected graph spanning
a set S of n ≥ 4 points in $\mathbb{R}^d$. Then G satisfies the following conditions:

(i) Each vertex of G (inclusive of Steiner points) is of degree either two or three.

(ii) By splitting each vertex v of G corresponding to an input point into deg(v) independent
endpoints of the edges, graph G can be decomposed into a number of minimal
Steiner trees.

(iii) G has at most 3n - 2 Steiner points.

6.1 PTAS 

Our spanner-based method for Euclidean minimum-cost biconnectivity cannot be ex- 
tended directly to include Euclidean minimum-cost Steiner biconnectivity since spanners 
do not include Steiner points. Nevertheless, the decomposition of an optimal Steiner so- 
lution into minimum Steiner trees given in Theorem 6.1 opens the possibility of using 
the aforementioned banyans to allow Steiner points for the purpose of approximating 
the Euclidean minimum Steiner tree problem in [20]. 

Definition 6.1. [20] A (1 + ε)-banyan of a set S of points in $\mathbb{R}^d$ is a geometrical
graph on a superset of S (i.e., Steiner points are allowed) such that for any subset U
of S, the cost of the shortest connected subgraph of the banyan which includes U, is at
most (1 + ε) times larger than the minimum Steiner tree of U.

Rao and Smith have proved the following useful result on banyans in [20]. 

Lemma 6.1. Let 0 < ε < 1. One can construct a (1 + ε)-banyan of an n-point set in
$\mathbb{R}^d$ which uses only n Steiner points and has cost within a factor of
$d^{[\ldots]}\,\varepsilon^{-O(d)}$ of the minimum Steiner tree of the set. The running time of the construction
is $d^{[\ldots]} (d/\varepsilon)^{[\ldots]}\, n + O(d\,n \log n)$.

By combining the definition of (1 + ε)-banyan with Theorem 6.1 (ii) and Lemma 6.1,
we get the following lemma. 

Lemma 6.2. For a finite point set S in $\mathbb{R}^d$ let $\mathcal{B}$ be a (1 + ε/4)-banyan constructed
according to Lemma 6.1. Let $\mathcal{B}^m$ be the multigraph obtained from $\mathcal{B}$ by doubling its
edges. There is a two-edge connected sub-multigraph of $\mathcal{B}^m$ which includes S and whose
cost is within (1 + ε/4) of the minimum cost of a two-edge connected multigraph on any
superset of S.

Let $\mathcal{B}^{m*}$ be the multigraph resulting from scaling and perturbing all the vertices (i.e.,
also the Steiner points) of the multigraph $\mathcal{B}^m$ specified in Lemma 6.2 according to the
first step in Section 3. The vertices of $\mathcal{B}^{m*}$ are on a unit grid $[0, L]^d$ and, by arguing
analogously as in Section 3, a minimum-cost two-edge connected sub-multigraph of
$\mathcal{B}^{m*}$ that includes S is within (1 + ε/4) of a minimum-cost two-edge connected
sub-multigraph of $\mathcal{B}^m$ that includes S.

The patching method of [5] applied to $\mathcal{B}^{m*}$ yields the following structure theorem.

Theorem 6.2. Choose a shifted dissection of the set of vertices of the banyan $\mathcal{B}^{m*}$ at
random. Then $\mathcal{B}^{m*}$ can be modified to a multigraph $\mathcal{B}^{m\prime}$ such that:

- $\mathcal{B}^{m\prime}$ is r-light with respect to the shifted dissection, where $r = [\ldots]$,

- the set of vertices of $\mathcal{B}^{m\prime}$ includes that of $\mathcal{B}^{m*}$ and some additional vertices placed
  at the crossings between the edges of $\mathcal{B}^{m*}$ and the boundaries of the regions in the
  shifted dissection,

- there exists a two-edge connected sub-multigraph of $\mathcal{B}^{m\prime}$ including S whose expected
  cost is within (1 + ε/4) of the minimum cost of a two-edge connected sub-multigraph
  of $\mathcal{B}^{m*}$ that includes S.

To find such a subgraph of $\mathcal{B}^{m\prime}$ efficiently we apply a simplification of the dynamic
programming method used by the PTAS from Section 3 to the set of vertices of $\mathcal{B}^{m\prime}$
(it would be even simpler to use a modification of the dynamic programming approach
from [5]). In effect we can find $\mathcal{B}^{m\prime}$ in expected time $O(n \log n)$ for constant ε and d.
By combining this with Lemma 6.2, Theorem 6.2, and the efficient transformation of
two-edge connected multigraphs into biconnected graphs, we obtain the main result in
this section.

Theorem 6.3. There exists an approximation algorithm for the minimum-cost Steiner
biconnectivity (and two-edge connectivity) which for any $\varepsilon > 0$ returns a Euclidean
Steiner biconnected (or two-edge connected) graph spanning the input set of $n$ points in
$\mathbb{R}^d$ and having expected cost within $(1+\varepsilon)$ of the optimum. The running time of the

algorithm is 0{nd^/‘^ \og{nd/e)) -\- (d/e)^^'^^ n -(- 

In particular, when $d$ and $\varepsilon$ are constant, then the running time is $O(n \log n)$. The
algorithm can be turned into a Las Vegas one without affecting the asymptotic time bounds.
Abstract. Given as input an edge-weighted graph, we analyze two algorithms
for finding subgraphs with low total edge weight. The first algorithm
finds a separator subgraph with a small number of components,
and is analyzed for graphs with an arbitrary excluded minor. The second
algorithm finds a spanner with small stretch factor, and is analyzed for
graphs in a hereditary family $\mathcal{G}(k)$. These results imply (i) a QPTAS
(quasi-polynomial time approximation scheme) for the TSP (traveling
salesperson problem) in unweighted graphs with an excluded minor, and
(ii) a QPTAS for the TSP in weighted graphs with bounded genus.

Keywords: graph minor, genus, separator, spanner, TSP, approximation
scheme.



1 Introduction 

In the traveling salesperson problem (TSP) we are given $n$ sites and their distance
matrix, and our goal is to find a simple closed tour of the sites with minimum
total distance. The TSP has driven both practical and theoretical algorithm research
for several decades [9]. Most variants are NP-hard, and therefore much
attention is given to approximate solutions for metric TSP, where the distance
matrix is a metric (nonnegative, symmetric, and satisfying the triangle inequality).
An algorithm of Christofides [6] finds a metric TSP solution with cost
at most 3/2 times optimal. We would prefer a polynomial time approximation
scheme (PTAS); that is, for each $\varepsilon > 0$, we would like a polynomial time algorithm
which produces a solution with cost at most $1+\varepsilon$ times optimal. However,
metric TSP is MAXSNP-hard even when all distances are one or two [12], and
so there is some positive $\varepsilon_0$ such that finding a $1+\varepsilon_0$ approximation is NP-hard.
Indeed, the 3/2 guarantee of Christofides has not been improved (although in
practice, other heuristics are much better).

However, there has been recent progress in certain restricted metric spaces.
In [8] we found a PTAS for the TSP on the nodes of an unweighted planar
graph, where the metric is given by shortest path lengths in the graph. We
later generalized this [4] to allow distances defined by non-negative edge costs.
Arora [3] (also Mitchell [11]) gave a PTAS for the TSP and related problems
for points in a Euclidean space of fixed dimension. Roughly speaking, all of
these results depend on the ability to find inexpensive and “well-connected”
separators.

In this paper we extend the methods of [4] from planar graphs to larger
graph families. This leads us to a general notion of well-connected separator
that may have other algorithmic applications. We present two subgraph finding
algorithms; in both algorithms the goal is a light subgraph, meaning that it
has low total edge weight. In Section 3 we give an algorithm finding a light
separator subgraph with few connected components, in graphs from any family
with a nontrivial excluded minor. In Section 4 we give an algorithm finding a
light spanner with low stretch factor, in graphs from a family $\mathcal{G}(k)$, to be defined.
This family includes graphs of bounded genus.

Finally, in Section 5 we sketch a QPTAS (quasi-polynomial time approximation
scheme) for metric TSP in two situations: first, for the shortest path
metric in an unweighted graph from a family with an excluded minor; second,
for the shortest path metric in an edge-weighted graph from the family $\mathcal{G}(k)$, for any
fixed $k$. Both schemes run in time $n^{O(\log\log n)}$.



2 Preliminaries 

All graphs in this paper are undirected and simple. A graph $G = (V, E)$ is edge-weighted
if it has a non-negative weight (or length) $\ell(e)$ on each edge $e \in E$; it
is vertex-weighted if it has a non-negative weight $w(v)$ on each vertex $v \in V$. A
subgraph $H$ of $G$ inherits these weights on its edges and vertices. The total edge
weight and vertex weight in $H$ are denoted by $\ell(H)$ and $w(H)$, respectively. The
number of connected components in $H$ is denoted $\#(H)$.

When $G$ is edge-weighted, $d_G(u, v)$ denotes the minimum length $\ell(P)$ of a
path $P$ connecting endpoints $u$ and $v$. This is zero when $u = v$, and $+\infty$ when
$u$ and $v$ are disconnected. $G' = (V, E')$ is a spanning subgraph if it spans each
component of $G$. Clearly $d_G(u, v) \le d_{G'}(u, v)$; the stretch factor $\mathrm{sf}(G', G)$ is the
minimum $s$ such that $d_{G'}(u, v) \le s \cdot d_G(u, v)$ for all $u, v \in V$ (it suffices to
consider only edge pairs $\{u, v\} \in E$). When $\mathrm{sf}(G', G) \le s$, we say that $G'$ is an
$s$-spanner in $G$.
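
The stretch factor can be checked numerically on small instances. The following Python sketch is illustrative only: the function names, and the representation of a graph as a list of `(u, v, length)` triples over vertices `0..n-1`, are assumptions of this example and not part of the paper. It evaluates the ratio only on the edge pairs of $G$, which the text notes is sufficient.

```python
import heapq
from math import inf


def shortest_distances(n, edges, source):
    """Dijkstra from `source` in an undirected graph on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def stretch_factor(n, edges, sub_edges):
    """Smallest s with d_{G'}(u,v) <= s * d_G(u,v) over the edges {u,v} of G.

    `edges` describes G and `sub_edges` describes the spanning subgraph G'.
    """
    worst = 1.0
    for u, v, _ in edges:
        d_g = shortest_distances(n, edges, u)[v]
        d_sub = shortest_distances(n, sub_edges, u)[v]
        if d_sub == inf:
            return inf  # G' does not span this component of G
        if d_g == 0:
            if d_sub > 0:
                return inf
            continue
        worst = max(worst, d_sub / d_g)
    return worst
```

For a unit-weight triangle, dropping one edge leaves a path whose stretch factor is 2, since the endpoints of the dropped edge are now at distance 2 instead of 1.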

Given a graph $G$, a minor of $G$ is a graph resulting from some sequence of edge
deletion, vertex deletion, or edge contraction operations (denoted $G - e$, $G - v$,
and $G/e$ respectively). Since we only consider simple graphs, we discard self-loops
and parallel edges. We say $G$ has an $H$-minor (denoted $H \le G$) if $G$ has a
minor isomorphic to $H$. A hereditary graph property is a class $\mathcal{P}$ of graphs closed
under isomorphism, such that whenever $G$ is in $\mathcal{P}$, so are its minors. In a series
of papers, Robertson and Seymour show that every hereditary graph property
is characterized by a finite set of forbidden minors (see [7] for an overview).
The prime example is Kuratowski’s characterization of planar graphs by the
forbidden minors $\{K_5, K_{3,3}\}$.
For a subset $X$ of vertices in $G$, an $X$-flap is the vertex set of a connected
component of $G - X$. Given a vertex-weighted graph $G$, a separator is a subgraph
$S$ such that every $V(S)$-flap has weight at most $w(G)/2$. Note that separators
are usually defined as just a vertex set, but in this paper we are interested in a
tradeoff between $\ell(S)$ and $\#(S)$.
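
As a concrete reading of these two definitions, the sketch below computes the $X$-flaps of a graph and tests the separator condition on the vertex set of a candidate subgraph. It is a minimal illustration: the names `flaps` and `is_separator` and the dictionary-based weight representation are assumptions of this example, not notation from the paper.

```python
def flaps(vertices, edges, removed):
    """Return the X-flaps: vertex sets of the connected components of G - X."""
    removed = set(removed)
    adj = {v: set() for v in vertices if v not in removed}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:  # depth-first search inside G - X
            u = stack.pop()
            component.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        result.append(component)
    return result


def is_separator(vertices, edges, weight, sep_vertices):
    """True if every V(S)-flap has weight at most w(G)/2."""
    half = sum(weight[v] for v in vertices) / 2
    return all(
        sum(weight[v] for v in flap) <= half
        for flap in flaps(vertices, edges, sep_vertices)
    )
```

On a path with five uniformly weighted vertices, the middle vertex is a separator (both flaps have weight 2, at most 5/2), while an end vertex is not.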

Let $V(G)^{\le k}$ denote the collection of sets of at most $k$ vertices in $G$. A haven
of order $k$ is a function $\beta$ assigning an $X$-flap to each $X \in V(G)^{\le k}$, such that
$\beta(Y) \subseteq \beta(X)$ whenever $X \subseteq Y \in V(G)^{\le k}$. Given a vertex-weighted graph $G$
and a non-separator vertex subset $X$, let $\beta^w(X)$ denote the unique $X$-flap with
weight exceeding $w(G)/2$. If $X$ is a separator, let $\beta^w(X) = \emptyset$. Note that if $G$ has
no separator of size $k$, then $\beta^w$ (restricted to $V(G)^{\le k}$) is a haven of order $k$.

3 A Well-Connected Separator 

Alon, Seymour, and Thomas [1] give a polynomial time algorithm to find a
separator in a graph with an excluded minor. Specifically, given as input a vertex-weighted
graph $G$ and a graph $H$, their algorithm either finds an $H$-minor in
$G$, or it finds a separator in $G$ with at most $h^{3/2} n^{1/2}$ vertices, where $h$ is the
number of vertices in $H$. In particular, if we fix a non-trivial hereditary graph
property $\mathcal{P}$ and only consider inputs $G \in \mathcal{P}$, then this algorithm finds separators
of size $O(n^{1/2})$; this generalizes the planar separator theorem [10].

They (and we) only consider the case $H = K_h$, since a $K_h$-minor implies an
$H$-minor. A covey is a forest $C$ in $G$ such that each pair of component trees is
connected by an edge of $G$. A covey with $\#(C) = h$ witnesses a $K_h$-minor.

In our application, G is also edge-weighted. We modify their algorithm to 
allow a trade-off between the total edge weight of the separator and the number 
of its connected components. We claim the following: 

Theorem 1. There is a polynomial time algorithm taking as input a vertex-weighted
edge-weighted graph $G$, a positive integer $h$, and a positive real $\varepsilon < 1$,
and which produces as output either:

(a) a $K_h$-minor in $G$, or

(b) a separator $S$ of $G$ such that $\ell(S) \le \varepsilon h\,\ell(G)$ and $\#(S) \le h^2/\varepsilon$.

We use much of their algorithm unchanged, so in this abstract we simply describe
and analyze our changes. Our basic subroutine is the following slight modification
of [1, Lemma 2.1]:

Lemma 2. Let $G$ be an edge-weighted graph with $m$ edges, let $A_1, \ldots, A_k$ be
subsets of $V(G)$, and let $\varepsilon$ be a positive real number. There is a polynomial time
algorithm which returns either:

(i) a tree $T$ in $G$ such that $\ell(T) \le \varepsilon\,\ell(G)$ and $V(T) \cap A_i \ne \emptyset$ for $i = 1, \ldots, k$; or

(ii) a set $Z \subseteq V(G)$ such that $|Z| \le (k - 1)/\varepsilon$ and no $Z$-flap intersects all of
$A_1, \ldots, A_k$.

The proof of this lemma is essentially the same, except that we use a shortest
paths tree rather than breadth first search. The rest of the proof is omitted.
The algorithm is iterative. After $t$ steps of the algorithm, we have a subgraph
$X_t$ and a covey $C_t$; initially $X_0$ and $C_0$ are empty. In step $t$ of the algorithm,
we halt if either $\#(C_{t-1}) \ge h$ or $X_{t-1}$ is a separator. Otherwise, we let
$B_{t-1} = \beta^w(X_{t-1})$ and we invoke Lemma 2 on $G[B_{t-1}]$, where the $A_i$'s are the
neighborhoods of the component trees in $C_{t-1}$; we call this step either a $T$-step
or a $Z$-step, depending on which is returned. The returned $T$ or $Z$ is then used
to define $X_t$ and $C_t$, according to several cases as described in [1]. We have these
invariants:

1. $X_t$ is a subgraph of $C_t$.

2. For each component tree $C$ in $C_t$, either $X_t \cap C$ equals some $T$ returned in
a $T$-step, or $X_t \cap C$ is a set of disconnected vertices contained in some $Z$
returned in a $Z$-step.

3. $B_t \subseteq B_{t-1}$; and if these are equal, then $X_t \subset X_{t-1}$.

4. $V(C_t)$ and $B_t$ are disjoint.

By the first invariant, $X_t$ is the union of at most $h$ parts of the form $X_t \cap C$.
By the second invariant and Lemma 2, each part has $\ell(X_t \cap C) \le \varepsilon \cdot \ell(G)$ and
$\#(X_t \cap C) \le (k - 1)/\varepsilon$. Therefore $\ell(X_t) \le h \varepsilon\,\ell(G)$ and $\#(X_t) \le h^2/\varepsilon$, as
required in Theorem 1(b).

By invariant 3 above, we see that the sequence of pairs $(|B_t|, |X_t|)$ is lexicographically
decreasing; therefore the algorithm halts after a polynomial number of iterations.
In fact an improved time analysis is possible, but we omit it here.

Remark. Our algorithm (and the original) may also be useful in situations where
we have a $G$ with no $K_h$-minor, but the vertex-weighting $w$ is unknown. Observe
that $w$ affects the algorithm in only two ways. First, it can tell us when to
stop because $X_t$ is a separator. Second, when we update $X_{t-1}$, $B_{t-1}$ splits into
disjoint flaps, and $w$ tells us which flap to take as the next $B_t$. Since the $B_t$'s
decrease, the tree of possible computations (depending on $w$) has at most $n$
leaves. The tree depth is at most the maximum number of iterations, considered
above. Therefore there is a polynomial size collection of vertex sets in $G$, such
that for any weighting $w$, one of them is a separator satisfying the conditions of
Theorem 1(b).

4 The Span Algorithm 

Althöfer et al. [2] introduced the following greedy algorithm to find an $s$-spanner
in an edge-weighted graph $G$. The parameter $s$ is at least one, and $d_{G'}(e)$ denotes
the length of the shortest path in $G'$ connecting the endpoints of edge $e$ (the
length may be $+\infty$):

Span($G = (V, E)$, $s$):
    $G' \leftarrow (V, \emptyset)$
    for all $e \in E$ in non-decreasing $\ell$ order do
        if $s \cdot \ell(e) < d_{G'}(e)$ then add $e$ to $G'$
    return $G'$
In the resulting $G'$, we have $d_{G'}(e) \le s \cdot \ell(e)$ for every edge $e \in E$; therefore $G'$
is an $s$-spanner. By comparison with Kruskal’s algorithm, we see that $T(G) =$
Span($G$, $n - 1$) is a minimum spanning forest in $G$, and Span($G$, $s$) always contains
$T(G)$. The Span algorithm is idempotent in this sense: if $G' =$ Span($G$, $s$),
then $G' =$ Span($G'$, $s$).
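
The greedy procedure translates almost line by line into code. The sketch below is one possible rendering in Python, assuming an edge-list input of `(u, v, length)` triples; the helper name `distance` and the use of Dijkstra's algorithm for $d_{G'}(e)$ are choices made for this example, and no attempt is made at the efficiency a serious implementation would need.

```python
import heapq
from math import inf


def distance(adj, source, target):
    """Shortest-path length between two vertices in the partial spanner G'."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale heap entry
        if u == target:
            return d
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return inf  # endpoints are not yet connected in G'


def span(edges, s):
    """Greedy s-spanner: scan edges by non-decreasing length, keep e if s*l(e) < d_{G'}(e)."""
    adj = {}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if s * w < distance(adj, u, v):
            kept.append((u, v, w))
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
    return kept
```

On a unit-weight 4-cycle, `span(edges, 1)` keeps all four edges, while `span(edges, 3)` (that is, $s = n - 1$) returns a spanning tree with three edges, matching the two extremes described in the text.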

Define the tree weight of $G'$ as $\mathrm{tw}(G') = \ell(G') / \ell(T(G'))$ (note that $T(G') =
T(G)$). We seek a tradeoff between $\mathrm{sf}(G', G)$ and $\mathrm{tw}(G')$. The algorithm has two
extremes: Span($G$, $n - 1$) has tree weight one but may have stretch factor $n - 1$;
Span($G$, 1) has stretch factor one but may have tree weight nearly $n^2/4$. For
intermediate $s$, the following tradeoff is known [2, Thm. 2]:

Theorem 3. If $s > 1$ and $G$ is planar then $\mathrm{tw}(\mathrm{Span}(G, s)) \le 1 + 2/(s - 1)$.

With $s$ close to one, this theorem is a critical element of the approximation
scheme for the TSP in weighted planar graphs [4]. Motivated by this application,
we seek to extend the result to larger graph families.

Definition 4. Suppose $G$ is a graph, $\ell$ is an edge weighting in $G$, and $T$ is a
spanning forest. Define:

1. $\mathrm{gap}_\ell(e) = d_{G-e}(e) - \ell(e)$.

2. $\mathrm{gap}(G, T) = \max_\ell \left(\sum_{e \notin T} \mathrm{gap}_\ell(e)\right) / \ell(T)$, where $\ell$ ranges over all edge weightings
such that $T = T(G)$.

3. $\mathrm{gap}(G) = \max_T \mathrm{gap}(G, T)$, where $T$ ranges over all spanning forests.

4. the graph class $\mathcal{G}(k) = \{G \mid \mathrm{gap}(G) \le k\}$.
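
For one fixed weighting $\ell$ and forest $T$, the quantity inside the maximum in item 2 is easy to evaluate directly. The Python sketch below does so; the names `gap_ratio` and `_distance` and the index-based description of $T$ are assumptions of this example. It does not check that $T$ is a minimum spanning forest under $\ell$, so its output is a lower bound on gap($G$, $T$) only when the caller guarantees $T = T(G)$.

```python
import heapq
from math import inf


def _distance(n, edges, skip, source, target):
    """Shortest-path length from source to target in G minus the edge with index `skip`."""
    adj = [[] for _ in range(n)]
    for i, (u, v, w) in enumerate(edges):
        if i != skip:
            adj[u].append((v, w))
            adj[v].append((u, w))
    dist = [inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist[target]


def gap_ratio(n, edges, tree_indices):
    """(sum over e not in T of gap_l(e)) / l(T) for the weighting stored in `edges`."""
    tree = set(tree_indices)
    tree_length = sum(edges[i][2] for i in tree)
    total_gap = 0.0
    for i, (u, v, w) in enumerate(edges):
        if i not in tree:
            total_gap += _distance(n, edges, i, u, v) - w  # gap_l(e) = d_{G-e}(e) - l(e)
    return total_gap / tree_length
```

For the unit-weight 4-cycle with $T$ a path of three edges, the single non-tree edge has gap $3 - 1 = 2$ and the ratio is $2/3$.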

Remark. Given $G$ and $T$, $\mathrm{gap}(G, T)$ is the value of a linear program; suppose $\ell$
achieves the maximum. If $e$ is a cut edge then $\mathrm{gap}_\ell(e)$ is infinite, but it does not
matter since $e \in T$ (and we may set $\ell(e) = 0$). For other edges $e$ we may assume
$\mathrm{gap}_\ell(e) \ge 0$, because otherwise we could improve $\ell$ by setting $\ell(e) = d_{G-e}(e)$.

Theorem 5. $\mathcal{G}(k)$ is a hereditary graph property.

Proof. $\mathcal{G}(k)$ is clearly closed under isomorphism; we need to show $\mathrm{gap}(H) \le
\mathrm{gap}(G)$ whenever $H \le G$. Take $\ell$ and $T'$ in $H$ such that $\mathrm{gap}(H, T') = \mathrm{gap}(H)$.
In $G$ we will define an edge weighting (also denoted $\ell$) and a spanning forest $T$.
By induction it suffices to consider these three cases:

Case $H = G - e$: If $e$ connects two components of $G - e$, let $\ell(e) = 0$ and include
$e$ in $T$ so $T - e = T'$. Otherwise let $\ell(e) = d_{G-e}(e)$ and $T = T'$.

Case $H = G - v$: By deleting edges first, we may assume $v$ is isolated. Then $\ell$
is unchanged, and we add the isolated $v$ to $T$.

Case $H = G/e$: By deleting edges first, we may assume that no two edges in $G$
merge to one in $G/e$. Let $\ell(e) = 0$ and include $e$ in $T$ so that $T/e = T'$.

In all cases we have constructed $\ell$ and $T$ such that $\left(\sum_{e \notin T} \mathrm{gap}_\ell(e)\right)/\ell(T) =
\mathrm{gap}(H, T')$, therefore $\mathrm{gap}(G) \ge \mathrm{gap}(H)$ as required. □

Let $g(H)$ denote the girth of a graph $H$. By considering the uniform edge
weighting $\ell = 1$, we have:

Corollary 6. $\mathrm{gap}(G) \ge \max_H\, (g(H) - 2)\left(|E(H)| / (|V(H)| - 1) - 1\right)$, where $H$
ranges over all 2-connected minors of $G$.
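
The bound of Corollary 6 is simple arithmetic once the girth, edge count, and vertex count of a minor are known. The short Python sketch below evaluates a single term of the maximum; the function names are assumptions of this example. Applied to the complete graph $K_h$ (girth 3, $h(h-1)/2$ edges), the term equals $h/2 - 1$.

```python
from fractions import Fraction


def corollary6_term(girth, num_edges, num_vertices):
    """One term (g(H) - 2) * (|E(H)| / (|V(H)| - 1) - 1) of the maximum in Corollary 6."""
    return (girth - 2) * (Fraction(num_edges, num_vertices - 1) - 1)


def complete_graph_term(h):
    """The term contributed by H = K_h, for h >= 3 (girth 3)."""
    return corollary6_term(3, h * (h - 1) // 2, h)
```

Exact rational arithmetic is used so that the value $h/2 - 1$ can be compared without rounding; for example, $K_6$ gives 2 and $K_5$ gives $3/2$.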

We now relate $\mathrm{gap}(G)$ to the Span algorithm. Let $\mathrm{gap}_{G'}(e)$ denote the edge
gap in $G'$, that is $d_{G'-e}(e) - \ell(e)$.

Lemma 7. If $G' = \mathrm{Span}(G, s)$, then $\ell(e) \le \frac{1}{s-1} \cdot \mathrm{gap}_{G'}(e)$ for all $e$ in $G'$.

Proof. We follow [3, Lem. 3]. Let $P$ be the shortest path in $G' - e$ connecting the
endpoints of $e$. Just before the algorithm inserts the longest edge $f \in \{e\} \cup P$,
we have $s \cdot \ell(e) \le s \cdot \ell(f) < d_{G'-f}(f) \le \ell(P) = d_{G'-e}(e)$. □

Theorem 8. If $s > 1$ and $G \in \mathcal{G}(k)$, then $\mathrm{tw}(\mathrm{Span}(G, s)) \le 1 + k/(s - 1)$.

Proof. We are given $G$ with some weighting $\ell$. Let $G' = \mathrm{Span}(G, s)$; we need to
show $\mathrm{tw}(G') \le 1 + k/(s - 1)$. Theorem 5 implies $\mathrm{gap}(G') \le k$. Let $T = T(G')$.
By the definition of $\mathrm{gap}(G')$, $\sum_{e \notin T} \mathrm{gap}_{G'}(e) \le k\,\ell(T)$. By the lemma we have
$\ell(G' - T) \le \frac{1}{s-1} \sum_{e \notin T} \mathrm{gap}_{G'}(e)$; the result now follows. □

Remark. Theorem 3 is proved by showing $\mathcal{G}(2)$ contains all planar graphs. Although
not stated in this way, they construct a feasible point in a linear program
dual to the definition of $\mathrm{gap}(G, T)$ (it is feasible even if we drop the $T = T(G)$
constraint).

Corollary 6 implies that $K_h$ is a forbidden minor in $\mathcal{G}(h/2 - 1 - \varepsilon)$; we conjecture
a converse relation.

Conjecture 9. There is a function $f(\cdot)$ such that $\mathcal{G}(f(h))$ contains all graphs with
no $K_h$-minor.

Absent this conjecture, we offer weaker evidence that $\mathcal{G}(k)$ is interesting:

Lemma 10. Suppose $G$ has genus $g$; that is, $G$ may be drawn without crossings
on an orientable surface with $g$ handles. Then $G \in \mathcal{G}(12g - 4)$.

Proof. (Sketch.) Suppose $G$ is drawn in an orientable surface with $g$ handles.
Choose a spanning tree $T$ such that $\mathrm{gap}(G, T) = \mathrm{gap}(G)$. For edges $e, f \notin T$,
say they are equivalent if the cycles in $e + T$ and $f + T$ are homotopic. If we
take $T$ and all the edges of one equivalence class, we get a planar subgraph
of $G$. Suppose there are $h$ equivalence classes; then $G$ is the union of $h$ planar
subgraphs $G_1, \ldots, G_h$ with a common spanning tree $T$. By Definition 4 we see
$\mathrm{gap}(G, T) \le \sum_{i=1}^{h} \mathrm{gap}(G_i, T)$, and this is at most $2h$.

It now suffices to show $h \le 6g - 2$. We contract $T$ to a point $p$, so the arcs
become non-crossing non-homotopic loops based at $p$. We pick one loop of each
class, and consider the faces defined by these $h$ loops. We may assume that each
face is a 2-cell, otherwise we could add another loop. Since no two loops are
homotopic, no face has two sides. There may be one face bounded by one loop
$e$, but then the other side of $e$ has at least four sides; all other faces have at least
three sides. Therefore $2e = \sum_{\Delta} |\Delta| \ge 3f - 1$, where $e$ is the number of loops, $f$
is the number of faces, and $|\Delta|$ is the number of sides of face $\Delta$. Combining this
with Euler’s formula $v - e + f = 2 - 2g$ gives our bound (here $v = 1$). A simple
construction shows $h = 6g - 2$ is achievable for $g \ge 1$. □
5 The Approximation Schemes

We will reuse the methods introduced for planar graphs and only sketch
them here. We are given as input a connected graph $G$, and a parameter $\varepsilon > 0$.
Our goal is to find a circuit in the graph visiting each vertex at least once, and
with length within $1 + \varepsilon$ times the minimum (this is equivalent to the original
metric TSP formulation). The minimum lies between $\ell(T(G))$ and $2\ell(T(G))$, so
it suffices to find a solution with additive error at most $\varepsilon\,\ell(T(G))$. We need to
handle these two cases:

Case $G$ is unweighted and has no $K_h$-minor: We introduce the uniform edge
weighting $\ell = 1$. By Mader’s Theorem [7, Theorem 8.1.1] there is a constant $K$ such that
$\ell(G) \le K\,\ell(T(G))$ (the best $K$ is $O(h\sqrt{\log h})$ [14]).

Case $G$ is weighted and in $\mathcal{G}(k)$: We replace $G$ by Span($G$, $1 + \varepsilon/4$); this substitution
introduces at most $(\varepsilon/2)\,\ell(T(G))$ additive error. Theorem 8 implies
$\ell(G) \le K\,\ell(T(G))$, where $K = 1 + 4k/\varepsilon$. Also by Corollary 6 we know $G$ contains
no $K_h$-minor, for $h > 2(k + 1)$.

Now in either case we know that $G$ has no $K_h$-minor, and that $\ell(G) \le K\,\ell(T(G))$.
We now need to find a circuit within $(\varepsilon/2)\,\ell(T(G))$ of optimal in time $n^{O(\log\log n)}$,
where the hidden constant depends on $\varepsilon$, $h$, and $K$.

Given a separator $S$ of $G$, it is easy to find a separation: that is, a triple
$(S, A_1, A_2)$ such that $S$ is a subgraph, $A_1 \cup A_2 = V(G)$, $A_1 \cap A_2 = V(S)$, there
are no edges between $A_1 - S$ and $A_2 - S$, and each $A_i - S$ has vertex weight at
most $(2/3)\,w(G)$. So by Theorem 1 we have:

Corollary 11. Suppose $G$ is an edge-weighted graph with $\ell(G) \le K\,\ell(T(G))$ and
no $K_h$-minor, and $\delta < 1$ is a positive real number. Then there is a polynomial
time algorithm finding a separation $(S, A_1, A_2)$ of $G$ such that $\ell(S) \le \delta\,\ell(G)$ and
$\#(S) \le h^3/\delta$.

We give $G$ a uniform vertex-weighting $w$. We will build a linear size decomposition
tree $\mathcal{T}$ of $G$, by repeated application of Corollary 11 with the parameter
$\delta = \gamma\varepsilon / \log n$, where $\gamma > 0$ is a constant to be determined.

If a weighted graph F in T has less than 0{5~^) vertices, then it is a leaf. 
Otherwise, we apply Corollary 11 to find a separation $(S_F, A_1, A_2)$ in $F$. For
each $A_i$ we let $F_i$ denote the graph that results from $F[A_i]$ by contracting each
component of $S_F$ to a point; $F_1$ and $F_2$ are the children of $F$ in $\mathcal{T}$. We call the
new contracted points portal points, and give them (for now) zero weight. Note
that $F_i$ may also inherit portal points from $F$, and that each edge of $F$ appears
in at most one child.

Since $w(F_i) \le (2/3)\,w(F)$, the depth of $\mathcal{T}$ is $O(\log n)$. We introduce at most
$f = h^3/\delta$ new portals in each split, so every graph in $\mathcal{T}$ has at most $p$ portals,
where $p = O(f \log n) = O(h^3 (\log^2 n)/\varepsilon)$. Since each original edge of $G$ appears
at most once in a level of $\mathcal{T}$, the edges of all $S_F$ contracted in a single level have
total weight at most $\delta\,\ell(G)$. Summing over all levels, the total weight of all the
contracted separators is $O(\gamma\varepsilon\,\ell(G))$. By a suitable choice of $\gamma = O(1/K)$, we may
ensure that this is at most $(\varepsilon/4)\,\ell(T(G))$.
Consider the optimum circuit $\tau$ in $G$. After performing the splits and contractions,
$\tau$ has an image $\tau_F$ in each graph $F$ of $\mathcal{T}$; $\tau_F$ enters and leaves $F$ through
its portals in some order, defining a sequence of portal-terminated paths within
$F$, covering its vertices. Furthermore, by a simple patching argument [4, Lemma
3.2], we may rearrange $\tau$ (without changing its cost) so that each $\tau_F$ uses each
portal of $F$ at most twice as an endpoint.

Therefore, we are led to the following problem, which we will solve approximately
by dynamic programming in the tree $\mathcal{T}$. Given a graph $F \in \mathcal{T}$ and a
sequence $\sigma$ of its portals where each portal appears at most twice in $\sigma$, a $\sigma$-tour
of $F$ is a sequence of paths covering $F$, with path endpoints as specified by the
list $\sigma$. For each $\sigma$, we want to find a near-optimal $\sigma$-tour in $F$.

If F is a leaf in T, then we exactly solve each such problem in time, 

using the ordinary minor separator theorem P^. If F has children, then after 
we have solved all such problems for $F_1$ and $F_2$, we may solve them for $F$ as
follows. Consider all pairings of a $\sigma_1$-tour in $F_1$ and a $\sigma_2$-tour in $F_2$; if they are
compatible, their paths patch together to give us some $\sigma$-tour in $F$. For each $\sigma$,
we record the cheapest combination obtained; we then recover a true $\sigma$-tour in
$F$ by “uncontracting” the edges of $S_F$ and charging each uncontracted edge at
most twice.

As shown above, the total weight of all these charged edges (over all of $\mathcal{T}$) is
at most $(\varepsilon/4)\,\ell(T(G))$; therefore the total additive error contributed by
this uncontraction is at most $(\varepsilon/2)\,\ell(T(G))$. We can show that this is the only source of
additive error in our solution, so we have the promised approximation scheme.

The time of the above algorithm is roughly the number of dynamic programming
subproblems. With our previous bound for $p$, this is
$n^{O((K h^3/\varepsilon) \log n \log\log n)}$. We can do better: by using the portal weighting
scheme of [4], we can ensure that each graph in $\mathcal{T}$ has at most $p = 6f$ portals,
while $\mathcal{T}$ still has $O(\log n)$ depth. With this improvement, our time bound is
$n^{O((K h^3/\varepsilon) \log\log n)}$.

6 Open Problems 

Of course proving Conjecture 9 would help unify our present results.

We would prefer a true polynomial time approximation scheme, rather than
quasi-polynomial. Our obstacle is the number of portal orderings $\sigma$ that we must
consider for each $F$. In the case of planar graphs [4], we overcame the obstacle
by observing that we only needed to consider those $\sigma$-tours corresponding to
non-crossing paths in an embedding of $F$ on a sphere with $O(1)$ holes. This
observation reduces the number of $\sigma$ considered to a simple exponential in $p$,
and consequently the total time is polynomial in $n$. In the present situation, a
bound like the above is unknown, but it is at least plausible. This is because
graphs with a forbidden minor are characterized in terms of blocks that can
be “nearly drawn” on a 2-manifold with a bounded genus and number of holes.

We would also like to solve the Steiner version of the problem, where along
with $G$ we are given a set of “terminal” vertices, and we want to find a minimum
length tour visiting all the terminals. The remark at the end of Section 3 is a
preliminary step in that direction.



Acknowledgment 

We thank Robin Thomas for help with Lemma 10.

References 

1. N. Alon, P. D. Seymour, and R. Thomas. A separator theorem for graphs with an 
excluded minor and its applications. In Proc. 22nd Symp. Theory of Computing, 
pages 293-299. Assoc. Comput. Mach., 1990. 

2. I. Althöfer, G. Das, D. P. Dobkin, D. Joseph, and J. Soares. On sparse spanners
of weighted graphs. Discrete Comput. Geom., 9(1):81-100, 1993. An early version
appeared in SWAT’90, LNCS V. 447.

3. S. Arora. Polynomial time approximation schemes for Euclidean traveling salesman 
and other geometric problems. Journal of the ACM, 45(5):753-782, Sept. 1998. 

4. S. Arora, M. Grigni, D. Karger, P. Klein, and A. Woloszyn. A polynomial-time 
approximation scheme for weighted planar graph TSP. In Proceedings of the Ninth 
Annual ACM-SIAM Symposium on Discrete Algorithms, pages 33-41, San Fran- 
cisco, California, 25-27 Jan. 1998. 

5. S. Arora, M. Grigni, D. Karger, P. Klein, and A. Woloszyn. A polynomial-time 
approximation scheme for weighted planar graph TSP. In 9th Annual ACM-SIAM 
Symp. on Discrete Algorithms, pages 33-41, Jan. 1998. 

6. N. Christofides. Worst-case analysis of a new heuristic for the traveling salesman
problem. In J. F. Traub, editor, Symposium on New Directions and Recent Results
in Algorithms and Complexity, page 441, NY, 1976. Academic Press. Also CMU
Tech. Report CS-93-13, 1976.

7. R. Diestel. Graph theory. Springer-Verlag, New York, 1997. Translated from the
1996 German original.

8. M. Grigni, E. Koutsoupias, and C. Papadimitriou. An approximation scheme for 
planar graph TSP. In 36th Annual Symposium on Foundations of Computer Sci- 
ence (FOCS’95), pages 640-645, Los Alamitos, Oct. 1995. IEEE Computer Society 
Press. 

9. E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys. The 
Traveling Salesman Problem. Wiley, 1985, 1992. 

10. R. J. Lipton and R. E. Tarjan. A separator theorem for planar graphs. SIAM 
Journal on Applied Mathematics, 36:177-189, 1979. 

11. J. S. B. Mitchell. Guillotine subdivisions approximate polygonal subdivisions: A simple
polynomial-time approximation scheme for geometric TSP, k-MST, and related
problems. SIAM Journal on Computing, 28, 1999.

12. C. H. Papadimitriou and M. Yannakakis. The traveling salesman problem with 
distances one and two. Mathematics of Operations Research, 18:1-11, 1993. 

13. N. Robertson and P. D. Seymour. Graph minors XVII: Excluding a non-planar 
graph. Submitted. 

14. A. Thomason. An extremal function for contractions of graphs. Math. Proc. 
Cambridge Philos. Soc., 95(2):261-265, 1984. 



Polynomial Time Approximation Schemes 
for General Multiprocessor Job Shop Scheduling 



Klaus Jansen$^1$ and Lorant Porkolab$^2$

$^1$ Institut für Informatik und praktische Mathematik, Christian-Albrechts-Universität
zu Kiel, 24 098 Kiel, Germany, kj@informatik.uni-kiel.de
$^2$ Department of Computing, Imperial College of Science, Technology and Medicine,
London SW7 2BZ, United Kingdom, porkolab@doc.ic.ac.uk



Abstract. We study preemptive and non-preemptive versions of the
general multiprocessor job shop scheduling problem: Given a set of $n$
tasks each consisting of at most $\mu$ ordered operations that can be processed
on different (possibly all) subsets of $m$ machines with different
processing times, compute a schedule with minimum makespan where
operations belonging to the same task have to be scheduled according to
the specified order. We propose algorithms for this problem that compute
approximate solutions of any positive $\varepsilon$ accuracy and run in $O(n)$
time for any fixed values of $m$, $\mu$ and $\varepsilon$. These results include (as special
cases) many recent developments on polynomial time approximation
schemes for scheduling jobs on unrelated machines, multiprocessor
tasks, and classical open, flow and job shops.



1 Introduction 

We consider the multiprocessor job shop scheduling problem: we are given a set of $n$
independent tasks $\mathcal{T} = \{T_1, \ldots, T_n\}$ and a set of $m$ machines $M = \{1, \ldots, m\}$.
Each task $T_j$ consists of at most $\mu$ multiprocessor operations $O_{j1}, O_{j2}, \ldots$ that
have to be scheduled in the given order. For each operation $O_{ji}$ there is a set
$\mathfrak{M}_{ji}$ consisting of at most $2^m$ different modes, where each processing mode $\rho \in
\mathfrak{M}_{ji}$ corresponds to a non-empty subset $\rho \subseteq M$ of processors and specifies the
operation’s execution time $p_{ji}(\rho)$ on that particular processor set. The objective
is to minimize the makespan, i.e. the maximum completion time, over all feasible
schedules.
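
To make the input format concrete, the following Python sketch models an instance as a list of tasks, each an ordered list of operations with a table of admissible modes. The class names `Operation` and `Task` and the function `chain_lower_bound` are hypothetical names chosen for this illustration; the paper defines no such code. The function computes only a trivial lower bound on the makespan (the longest task when every operation uses its fastest mode), under the assumption that an operation cannot finish sooner than its fastest mode allows; it is not the approximation scheme of the paper.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass
class Operation:
    # Maps each admissible mode (a non-empty set of machine indices)
    # to the processing time of the operation in that mode.
    modes: Dict[FrozenSet[int], float]


@dataclass
class Task:
    # Operations must be processed one after another in list order.
    operations: List[Operation]


def chain_lower_bound(tasks: List[Task]) -> float:
    """Lower bound on the makespan: the longest task using fastest modes only."""
    return max(
        sum(min(op.modes.values()) for op in task.operations)
        for task in tasks
    )
```

For instance, a task whose two operations have fastest modes of length 2 and 4 forces a makespan of at least 6, whatever the other tasks do.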

We focus here on the large and non-trivial (NP-hard) subclass of the problem,
where both $m$ and $\mu$ are fixed. Following the standard notation scheme of
the scheduling literature, the non-preemptive (preemptive) version
of the latter problem is denoted by $Jm|set_{ij}, op \le \mu|C_{\max}$ ($Jm|set_{ij}, op \le
\mu, pmtn|C_{\max}$). This problem can be viewed as a generalization of two well
(but mainly independently) studied scheduling problems, job shop with $\mu$ operations
per job ($Jm|op \le \mu|C_{\max}$) and general set-constrained multiprocessor
task scheduling ($Pm|set_j|C_{\max}$). In classical job shop scheduling each operation
is required to be processed on a single prespecified machine, so in terms of the
above formulation, for each operation there is only one processing mode corresponding
to a single machine. In all of the different variants of multiprocessor
task scheduling (dedicated, parallel, malleable, set-constrained), tasks are processed
by subsets of processors, but there are no precedence relations specified for
them. Therefore these can also be obtained (as special cases) from the previously
defined general problem by requiring each task to consist of a single operation.
Since both of the above special cases (job shop and multiprocessor task scheduling)
are NP-hard even if there are only a constant number of machines [5,7,9,16],
it is natural to study how closely the optimum can be approximated by efficient
algorithms. Focusing on the case where $m$ and $\mu$ are fixed, we will show that
there are linear time approximation schemes for the problem providing a unified
extension of recent approximability results for both of the above special cases.

Job shop scheduling is considered to be one of the most difficult problems
in combinatorial optimization, both from the theoretical and practical points of
view. Even very restricted versions of the problem are strongly NP-hard. For
those instances where $m$ and $\mu$ are fixed (the restricted case we are focusing on
in this paper), Shmoys et al. [18] gave approximation algorithms that compute
$(2 + \varepsilon)$-approximate solutions in polynomial time for any fixed $\varepsilon > 0$. This result
has recently been improved by Jansen et al. [14], who have shown that $\varepsilon$-approximate
solutions of the problem can be computed in linear time for any
fixed $\varepsilon > 0$.

In classical scheduling theory, each task is processed by only one processor at 
a time. However recently, due to the rapid development of parallel computer sys- 
tems, new theoretical approaches have emerged to model scheduling on parallel 
architectures. One of these is scheduling multiprocessor tasks, see e.g. [6,7]. The
general (in contrast to dedicated, malleable, parallel) variant of non-preemptive
scheduling for independent multiprocessor tasks on a fixed number of processors
is denoted by $Pm|set_j|C_{\max}$. Regarding the complexity, $P3|set_j|C_{\max}$ is
strongly NP-hard, thus already this restricted version has no fully polynomial
approximation schemes, unless P=NP. For $Pm|set_j|C_{\max}$, Bianco et al.
presented an approximation algorithm whose approximation ratio is bounded
by $m$. Later Chen and Lee improved their algorithm by achieving an approximation
ratio $m/2 + \varepsilon$, for any $\varepsilon > 0$. Until very recently, this was the best
approximation result for the problem, and it was not known whether there is a 
polynomial-time approximation scheme or even a polynomial-time approxima- 
tion algorithm with an absolute constant approximation guarantee. Chen and 
Miranda jSj have proposed a dynamic programming based polynomial-time ap- 
proximation scheme for Pm\setj\Cmax whose running time is where the 

hidden constant in the big-0 notation is proportional with Indepen- 
dently from their work the authors have also proposed a polynomial time 

approximation scheme for Pm\setj\Cmax that computes an e-approximate so- 
lution in 0{n) time for any fixed positive accuracy e. The hidden constant in 
the previous bound - similarly as before - depends exponentially on the fixed 
parameters m and e, but here the constant appears as a multiplicative factor 
of n and not as an exponent. Hence this linear programming based approach 
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leads to a substantially better (worst-case theoretical) running time than the 
one in and also answers an open question of the latter paper by providing 
an polynomial time approximation scheme not based on dynamic programming. 

In this paper we integrate many of the above mentioned recent results
that have shown the existence of polynomial-time approximation schemes
for various shop and multiprocessor scheduling problems. We present
linear-time approximation schemes for both $Jm|set_{ij}, op \le \mu|C_{\max}$ and
$Jm|set_{ij}, op \le \mu, pmtn|C_{\max}$ (under the assumption that $m$ and $\mu$ are fixed), making these
the most general problems currently known in shop and multiprocessor task
scheduling with makespan minimization that possess polynomial-time
approximation schemes. Some of the previous results on job shop scheduling
were based on a general vector summation result of Sevastianov. While this
application of vector summation in scheduling is very interesting, it made the above
papers highly dependent on the fairly involved underlying algorithm.

Sevastianov's algorithm computes a schedule for the job shop problem with
makespan at most $L_{\max} + \mu^3 m P_{\max}$, where $L_{\max}$ is the maximum load of
operations assigned to any machine and $P_{\max} = \max_{j,i} p_{ji}$. The following simple
example shows that we cannot use Sevastianov's algorithm in the general
multiprocessor job shop model, in contrast to its applications in the earlier
job shop papers. In these
papers, it was applied only for operations with small $P_{\max}$ value to obtain
schedules of length at most $(1 + \varepsilon)L_{\max}$. Suppose that $m = 3$ and there are three types
of tasks $T_1, T_2, T_3$, each with $\mu = 1$ and one mode per operation. Also assume that
each task of type $T_j$ requires processor set $\pi_j$, where $\pi_1 = \{1, 2\}$, $\pi_2 = \{2, 3\}$ and
$\pi_3 = \{1, 3\}$. Consider instances where we have the same number $n$ of tasks of
each type and every operation is of length 1. Then the maximum load is $L_{\max} = 2n$
(each processor belongs to exactly two of the three sets), whereas the optimum
makespan is $OPT = 3n$, since any two tasks share a processor and therefore
cannot run in parallel. In this case, all operations are small,
and yet no algorithm can achieve $(1 + \varepsilon)L_{\max}$ for $\varepsilon < 1/2$.
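
The gap between load and makespan in this example can be checked numerically. The following is a minimal sketch, assuming unit-length operations as in the example; the function names are illustrative and not taken from the paper.

```python
from itertools import combinations


def max_load(tasks):
    """Maximum total processing time assigned to a single processor.

    tasks is a list of (processor_set, length) pairs.
    """
    load = {}
    for procs, length in tasks:
        for p in procs:
            load[p] = load.get(p, 0) + length
    return max(load.values())


def pairwise_conflicting(tasks):
    """True if every two tasks share at least one processor."""
    return all(a[0] & b[0] for a, b in combinations(tasks, 2))


def example(n):
    """Return (L_max, OPT) for n unit-length tasks of each of the three types."""
    types = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]
    tasks = [(t, 1) for t in types for _ in range(n)]
    l_max = max_load(tasks)
    # No two tasks can overlap in time, so any schedule is sequential and
    # the optimum makespan equals the total length of all tasks.
    assert pairwise_conflicting(tasks)
    opt = sum(length for _, length in tasks)
    return l_max, opt
```

For any $n$ the ratio OPT / L_max is 3/2, which is why a bound of the form $(1 + \varepsilon)L_{\max}$ is out of reach here.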

Here we not only generalize and combine the previous results, but also show,
by introducing a new shifting procedure, that the use of Sevastianov's algorithm
can be avoided, making the algorithms and the whole presentation simpler and
self-contained.



2 Non-preemptive Scheduling 

In this section we consider the non-preemptive version of the problem, where a
task, once started, has to be completed without interruption. Thus a schedule
consists of a processor assignment $\rho_{ji} \in m_{ji}$ (i.e. one of the feasible processor
sets) and a starting time $\tau_{ji}$ for each operation $O_{ji}$ such that at any time there is
no processor assigned to more than one operation, and the operations $O_{j1}, \ldots, O_{j\mu}$
of task $T_j$ are executed one after another, i.e. $\tau_{ji} + p_{ji}(\rho_{ji}) \le \tau_{j,i+1}$ for every
$1 \le i < \mu$. The objective is to compute a non-preemptive schedule that minimizes
the overall makespan $C_{\max} = \max\{\tau_{j\mu} + p_{j\mu}(\rho_{j\mu}) : T_j \in \mathcal{T}\}$.
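
The three feasibility conditions (feasible mode, precedence within a task, no processor shared by two operations at the same time) can be sketched as a small checker. The data layout below (dictionaries keyed by task and operation index) is an assumption made for illustration only.

```python
from itertools import combinations


def makespan_if_feasible(schedule, ptime):
    """Return the makespan of a non-preemptive schedule, or None if infeasible.

    schedule[j] is the list of (start, processor_set) pairs of task j, one per
    operation in order; ptime[(j, i)] maps each feasible processor set of
    operation i of task j to its processing time.
    """
    placed = []
    for j, ops in schedule.items():
        ready = 0
        for i, (start, procs) in enumerate(ops):
            if procs not in ptime[(j, i)]:
                return None          # processor set is not a feasible mode
            if start < ready:
                return None          # previous operation not yet finished
            ready = start + ptime[(j, i)][procs]
            placed.append((start, ready, procs))
    for (s1, e1, p1), (s2, e2, p2) in combinations(placed, 2):
        if p1 & p2 and s1 < e2 and s2 < e1:
            return None              # a processor runs two operations at once
    return max((end for _, end, _ in placed), default=0)
```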







2.1 Snapshots, Relative Schedules, Blocks, and Configurations 

Let $k$ be a constant (depending on $m$, $\mu$ and $\varepsilon$) that will be specified later. First,
the algorithm will select a subset $\mathcal{L}$ of $k$ long tasks with the largest values
$d_1 \ge d_2 \ge \cdots \ge d_k$ and $d_k \ge d_j$ for $k + 1 \le j \le n$, where
$d_j = \sum_{i=1}^{\mu} \min_{\rho \in m_{ji}} p_{ji}(\rho)$.

Let $D = \sum_{T_j \in \mathcal{T}} d_j$, $\mathcal{S} = \mathcal{T} \setminus \mathcal{L}$, $M = \{1, \ldots, m\}$, and let OPT be the length of an
optimum schedule. Then, we have $D/m \le OPT \le D$. By normalization (divide
all execution times by $D$), we may assume without loss of generality that $D = 1$
and that $1/m \le OPT \le 1$.
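
The selection of long tasks and the normalization can be sketched as follows. This assumes that $d_j$ is the sum, over the operations of $T_j$, of the fastest processing time among the feasible processor sets; the function name and data layout are hypothetical.

```python
from fractions import Fraction


def split_long_short(ptimes, k):
    """Pick the k tasks with the largest d_j and normalize so that D = 1.

    ptimes[j] is a list with one dict per operation of task j, mapping each
    feasible processor set to its processing time.
    Returns (long_ids, short_ids, scaled_d).
    """
    # d_j: every operation is run in its fastest mode (assumed definition).
    d = [sum(min(op.values()) for op in task) for task in ptimes]
    total = Fraction(sum(d))
    order = sorted(range(len(d)), key=lambda j: -d[j])
    long_ids, short_ids = order[:k], order[k:]
    scaled = [Fraction(x) / total for x in d]   # divide all times by D
    return long_ids, short_ids, scaled
```

After scaling, the values in `scaled` sum to 1, matching the assumption $D = 1$ used in the rest of the section.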

A processor assignment for $\mathcal{L}$ is a mapping $f : \{O_{ji} \mid T_j \in \mathcal{L}\} \to 2^M$ with
$f(O_{ji}) \in m_{ji}$. Two operations $O_{ji}$ and $O_{j'i'}$ for $T_j, T_{j'} \in \mathcal{L}$ are compatible if
$j \ne j'$ and if they are processed on different machines (i.e. $f(O_{ji}) \cap f(O_{j'i'}) = \emptyset$).
For a given processor assignment of $\mathcal{L}$, a snapshot of $\mathcal{L}$ is a subset of compatible
operations. A relative schedule $R = (f, M(1), \ldots, M(g))$ of $\mathcal{L}$ is a processor
assignment $f$ along with a sequence $M(1), \ldots, M(g)$ of snapshots of $\mathcal{L}$ such that:
$M(1) = M(g) = \emptyset$; each operation $O_{ji}$ of task $T_j \in \mathcal{L}$ occurs in a subsequence of
consecutive snapshots $M(\alpha_{ji}), \ldots, M(\beta_{ji})$, $2 \le \alpha_{ji} \le \beta_{ji} < g$ (where $M(\alpha_{ji})$ is
the first and $M(\beta_{ji})$ is the last snapshot that contains $O_{ji}$); for operations $O_{ji}$
and $O_{j,i+1}$ of task $T_j \in \mathcal{L}$, $\beta_{ji} < \alpha_{j,i+1}$; and consecutive snapshots are different.
We observe that $g$ can be bounded by $2k\mu + 1$. For a snapshot $M(\ell)$, let
$P(\ell) = \bigcup_{O_{ji} \in M(\ell)} f(O_{ji})$ be the set of processors that are used by operations from
long tasks during snapshot $M(\ell)$.

Each of these snapshots $M(\ell)$ will be divided inductively into a number of
subintervals. For each $\ell = 1, \ldots, g$, let $t_\ell$ denote the length of snapshot $M(\ell)$,
where we may assume that each $t_\ell \le 1$. Let $\delta < 1$ be a constant that will be
specified later. In the first step, we divide $M(\ell)$ into $1/\delta$ subintervals of length
$\delta t_\ell \le \delta$. We call these subintervals blocks of depth 1. Next, each block of depth 1
is divided into $1/\delta$ subintervals of length $\delta^2 t_\ell \le \delta^2$. These subintervals are called
blocks of depth 2. We iterate this subdivision $\mu$ times and end up for each
snapshot $M(\ell)$ with $1/\delta^\mu$ blocks of depth $\mu$ (each with length $\delta^\mu t_\ell$). Let $M_\mu(\ell)$
denote the set of all blocks of depth $\mu$ in snapshot $M(\ell)$. The total number
of blocks of depth $\mu$ is bounded by $(2k\mu + 1)/\delta^\mu$.
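
The subdivision can be made concrete with exact rational arithmetic. This is only a sketch under the assumption that $1/\delta$ is an integer; the function names are illustrative.

```python
from fractions import Fraction


def depth_mu_blocks(t, delta, mu):
    """Split a snapshot of length t into its blocks of depth mu.

    delta must be a Fraction with integral 1/delta. Returns the list of
    (start, end) intervals relative to the start of the snapshot.
    """
    per_level = 1 / delta
    if per_level.denominator != 1:
        raise ValueError("1/delta must be an integer")
    count = int(per_level) ** mu        # 1/delta^mu blocks of depth mu
    length = delta ** mu * t            # each of length delta^mu * t
    return [(i * length, (i + 1) * length) for i in range(count)]


def total_blocks_bound(k, mu, delta):
    """Upper bound (2*k*mu + 1) / delta^mu on the number of depth-mu blocks."""
    return (2 * k * mu + 1) / delta ** mu
```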

For each block $q \in M_\mu(\ell)$, we consider a set of configurations. Let $P(q)$ be
the processor set used by operations of long jobs in block $q$ (i.e. $P(q) = P(\ell)$
for every block $q$ in snapshot $M(\ell)$). Furthermore, let $P_{q,i}$, $i = 1, \ldots, n_q$, denote
the different partitions of $F(q) = M \setminus P(q)$, and let $\mathcal{P}_q = \{P_{q,1}, \ldots, P_{q,n_q}\}$. For
each partition $P_{q,i}$ we introduce a variable $x_{q,i}$ to indicate the total length of
time during which $P_{q,i}$ is in use, i.e. during which only processors of $F(q)$ are
executing short tasks and each subset of
processors $F \in P_{q,i}$ executes at most one operation of a short task in $\mathcal{S}$ at each
time step in parallel.
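
Since $m$ is fixed, the partitions of the free processors of a block can simply be enumerated. The sketch below generates all set partitions recursively; the helper names are illustrative, and the number of partitions is a Bell number, a constant for fixed $m$.

```python
def free_processors(all_procs, used_by_long):
    """F(q) = M \\ P(q): processors not occupied by long operations."""
    return set(all_procs) - set(used_by_long)


def set_partitions(items):
    """Yield every partition of items as a list of blocks (lists)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Put the first element into a block of its own ...
        yield [[first]] + part
        # ... or add it to one of the existing blocks.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
```

Each yielded partition corresponds to one configuration $P_{q,i}$, and the LP attaches one length variable $x_{q,i}$ to it.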

Let $L_{q\rho}$ be the total processing time for all small operations in $\mathcal{S}$ executed on
processor set $\rho$ in block $q$. For each task $T_j \in \mathcal{S}$ we use a set of decision variables
$y_{jab} \in [0, 1]$ for vectors $a \in A$ and $b \in B$, where
$A = \{a : 1 \le a_1 \le a_2 \le \ldots \le a_\mu \le g/\delta^\mu\}$ and
$B = \{b : b_i \in 2^M, b_i \ne \emptyset, i = 1, \ldots, \mu\}$. Variable $y_{jab} = 1$ if and
only if each operation $O_{jk}$ of task $T_j$ is scheduled in block $a_k$ on processor set
$b_k$ for each $1 \le k \le \mu$; otherwise $y_{jab}$ represents the corresponding fraction of
task $T_j$'s processing. For a given block $q$, processor set $\rho$, $a \in A$ and $b \in B$, let
$K^{ab}_{q\rho} = \{k : 1 \le k \le \mu, a_k = q, b_k = \rho\}$. Then, using the previous notation, the
load $L_{q\rho}$ can be written as
$L_{q\rho} = \sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y_{jab}$.

2.2 Linear Programming Formulation 

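The linear program $LP(R)$ for a given relative schedule $R = (f, M(1), \ldots, M(g))$
of $\mathcal{L}$ is as follows:

Minimize $\sum_{\ell=1}^{g} t_\ell$ subject to

(1) $t_\ell \ge 0$, $\ell = 1, \ldots, g$,

(2) $\sum_{\ell=\alpha_{ji}}^{\beta_{ji}} t_\ell = p_{ji}(f(O_{ji}))$, $\forall T_j \in \mathcal{L}$, $i = 1, \ldots, \mu$,

(3) $y_{jab} \ge 0$, $\forall T_j \in \mathcal{S}$, $a \in A$, $b \in B$,

(4) $\sum_{a \in A, b \in B} y_{jab} = 1$, $\forall T_j \in \mathcal{S}$,

(5) $x_{q,i} \ge 0$, $\forall q \in M_\mu(\ell)$, $i = 1, \ldots, n_q$,

(6) $\sum_{1 \le i \le n_q : \rho \in P_{q,i}} x_{q,i} \ge L_{q\rho}$, $\forall q \in M_\mu(\ell)$, $\rho \in 2^M \setminus \{\emptyset\}$,

(7) $\sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} y_{jab}\, p_{jk}(\rho) \le L_{q\rho}$, $\forall q \in M_\mu(\ell)$, $\rho \in 2^M \setminus \{\emptyset\}$,

(8) $\sum_{i=1}^{n_q} x_{q,i} = \delta^\mu t_\ell$, $\forall q \in M_\mu(\ell)$, $\ell = 1, \ldots, g$,

(9) $y_{jab} = 0$, $\forall T_j \in \mathcal{S}$ with $b_k \notin m_{jk}$ or $b_k \cap P(a_k) \ne \emptyset$.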

In (9) we set some of the variables $y_{jab}$ to zero, since processors in $P(a_k)$
cannot be used for operations of small tasks, and if $b_k \notin m_{jk}$, then it is a non-feasible
processor set for operation $O_{jk}$.

Lemma 1. The optimum of $LP(R)$ is not larger than the makespan of any
schedule of $\mathcal{T}$ that respects the relative schedule $R$.

Proof. Start with an optimum schedule $S^*$ for $\mathcal{T}$ that respects $R$. First, we show
how to define the variables $y_{jab}$ for each small task $T_j \in \mathcal{S}$. Assume that $S^*$
schedules each operation $O_{jk}$ on processor set $\rho_{jk} \in m_{jk}$ (a feasible processor
set) for a time interval of length $p_{jk}(\rho_{jk})$ that spans consecutive blocks $q(s_{jk}), \ldots, q(e_{jk})$,
where $s_{jk}$ might be equal to $e_{jk}$. Let $f_{jk}(i)$ be the fraction of operation $O_{jk}$ that
is scheduled in block $q(i)$ and define $b_k = \rho_{jk}$ for $1 \le k \le \mu$. Let $b = (b_1, \ldots, b_\mu)$ and
$s_j = (s_{j1}, \ldots, s_{j\mu})$, and assign values to the variables $y_{j s_j b}$ as follows: Set
$y_{j s_j b} = r$, where $r = \min\{f_{jk}(s_{jk}) : 1 \le k \le \mu\}$. To cover the remaining $1 - r$ fraction of
each operation, we assign values to the other variables. To do this, we set
$f_{jk}(s_{jk}) = f_{jk}(s_{jk}) - r$. For at least one operation $O_{jk}$ the new value $f_{jk}(s_{jk})$ is
zero; for those operations with $f_{jk}(s_{jk}) = 0$ we set $s_{jk} = s_{jk} + 1$. Then, we assign
values to the new variable $y_{j s_j b}$ as above and repeat the procedure until $r = 0$.
Each iteration of this procedure assigns a value to a different variable, since from
one iteration to the next at least one block $s_{jk}$ is changed. The assignment of
values to variables $y_{jab}$ generates the load $L_{q\rho}$ of small operations for each block
$q$ and subset $\rho$. Now consider a block $q$ in snapshot $M(\ell)$ of length $\delta^\mu t_\ell$. The
optimum schedule $S^*$ gives us directly the partitions $P_{q,i}$ used inside block $q$
and the corresponding lengths $x_{q,i}$. The assignment of values to these variables
satisfies the constraints of the linear program $LP(R)$, where the total length
$\sum_{\ell=1}^{g} t_\ell$ is equal to the makespan of $S^*$. Therefore, the optimum objective value
of $LP(R)$ is less than or equal to the makespan of $S^*$. □

Let $OPT_R$ be the length of an optimum schedule that respects relative
schedule $R$. Now we describe how to compute an approximate solution of $LP(R)$. First
we guess the length $s$ of an optimum schedule and add the constraint

(0) $\sum_{\ell=1}^{g} t_\ell \le s$

to $LP(R)$. Then, we replace (6) and (7) with the following constraints (one for
each block $q$ and subset $\rho$):

(6-7) $\sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y_{jab} - \sum_{1 \le i \le n_q : \rho \in P_{q,i}} x_{q,i} + 1 \le \lambda$.

The new linear program, denoted by $LP(R, s, \lambda)$, has a special block angular
structure. The blocks $B_j = \{y_j : y_{jab} \ge 0, \sum_{a,b} y_{jab} = 1\}$ for $T_j \in \mathcal{S}$ are
simplices (of fixed dimension), and
$B_{|\mathcal{S}|+1} = \{(t_\ell, x_{q,i}) \mid \text{conditions (0), (1), (2), (5), (8)}\}$
is a block with only a constant number of variables and constraints. The
coupling constraints are the inequalities (6-7). Note that for each $q$ and $\rho$, the
function
$f_{q\rho} = \sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y_{jab} - \sum_{1 \le i \le n_q : \rho \in P_{q,i}} x_{q,i} + 1$
is non-negative over the blocks, since $\sum_{1 \le i \le n_q : \rho \in P_{q,i}} x_{q,i} \le 1$.

The Logarithmic Potential Price Directive Decomposition Method developed
by Grigoriadis and Khachiyan can be used to get an $\varepsilon$-relaxed decision
procedure for $LP(R, s, \lambda)$. This procedure either determines that $LP(R, s, 1)$ is
infeasible, or computes a feasible solution of $LP(R, s, 1 + \varepsilon)$. For any fixed $m$ and
$\varepsilon > 0$, the overall running time of the procedure is $O(n)$. We disregard the relative
schedule $R$ if $LP(R, 1, 1)$ is infeasible. Using binary search on $s \in [1/m, 1]$ we can
compute in a constant number $O(\log \frac{m}{\varepsilon})$ of iterations a value $s \le OPT_R(1 + \frac{\varepsilon}{8})$
such that $LP(R, s, 1 + \varepsilon)$ is feasible.
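
The binary search can be sketched with the relaxed decision procedure treated as a black box. Here `relaxed_decision` is a hypothetical callable standing in for the Grigoriadis and Khachiyan procedure: it returns False only when $LP(R, s, 1)$ is infeasible and True when it has found a feasible solution of $LP(R, s, 1 + \varepsilon)$. The stopping tolerance $\varepsilon/(8m)$ is a choice made for this sketch, not a detail taken from the paper.

```python
import math


def search_makespan(relaxed_decision, m, eps):
    """Bisection on s in [1/m, 1]; returns (s, iterations) or None.

    Invariant: the oracle rejected lo (or lo = 1/m <= OPT) and accepted hi.
    On exit hi - lo <= eps/(8m) <= (eps/8) * lo, so hi <= (1 + eps/8) * lo.
    """
    if not relaxed_decision(1.0):
        return None                  # disregard this relative schedule
    lo, hi = 1.0 / m, 1.0
    iterations = 0
    while hi - lo > eps / (8 * m):
        mid = (lo + hi) / 2
        if relaxed_decision(mid):
            hi = mid
        else:
            lo = mid
        iterations += 1
    return hi, iterations


def max_iterations(m, eps):
    """The interval has length below 1, so log2(8m/eps) halvings suffice."""
    return math.ceil(math.log2(8 * m / eps))
```

With $m$ and $\varepsilon$ fixed, the number of oracle calls is a constant of order $\log(m/\varepsilon)$, in line with the bound stated above.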

2.3 Rounding 

In this subsection, we show how to generate a feasible schedule using an
approximate solution of the previously defined linear program. For the solution
$(t^*, x^*, y^*)$ obtained after the binary search on $s$, let
$e_{q,\rho} = \sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y^*_{jab} - \sum_{1 \le i \le n_q : \rho \in P_{q,i}} x^*_{q,i}$.
The inequalities (6-7) imply that for any block $q$ and non-empty subset $\rho$,
$e_{q,\rho} \le \varepsilon$. If $e_{q,\rho} \le 0$, then we have nothing to do for the pair $(q, \rho)$.

Let $\bar{L}_{q,\rho} = \sum_{1 \le i \le n_q : \rho \in P_{q,i}} x^*_{q,i}$ be the free space for small tasks that use processor
set $\rho$. In the first step, we shift a subset $\bar{\mathcal{S}} \subset \mathcal{S}$ of small tasks to the end of the
schedule such that
$\sum_{T_j \in \mathcal{S} \setminus \bar{\mathcal{S}}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y^*_{jab} \le \bar{L}_{q,\rho}$. Then, the
subset $\mathcal{S} \setminus \bar{\mathcal{S}}$ of remaining small tasks fits into the free space for the $\rho$-processor
tasks. Notice that this step is not sufficient to generate a feasible schedule. In a
second phase, we use a new technique to transform the schedule into a feasible
one. In the following, we show how to compute $\bar{\mathcal{S}}$ in linear time for any fixed
$m$. First, we modify the $y$-components of the solution. The $y$-components of
the solution of $LP(R, s, 1 + \varepsilon)$ can be considered as fractional assignments. The
lengths of $y$ are defined as
$L_{q,\rho} = \sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y^*_{jab}$ for
each block $q$ and subset $\rho$. For every $q$ and $\rho$, we have $L_{q,\rho} \le \bar{L}_{q,\rho} + \varepsilon$. Consider
now the following system:






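(a) $y_{jab} \ge 0$, $\forall T_j \in \mathcal{S}$, $a \in A$, $b \in B$,

(b) $\sum_{a \in A, b \in B} y_{jab} = 1$, $\forall T_j \in \mathcal{S}$,

(c) $\sum_{T_j \in \mathcal{S}} \sum_{a \in A, b \in B} \sum_{k \in K^{ab}_{q\rho}} p_{jk}(\rho)\, y_{jab} = L_{q,\rho}$, $\forall q \in M_\mu(\ell)$, $\rho \in 2^M \setminus \{\emptyset\}$.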

This system can also be written in matrix form $Cz = c$, $z \ge 0$. Using linear
programming we can obtain a solution in polynomial time with only a constant
number of fractional assignments. We now give a rounding technique that needs
only linear time. Let us assume that the columns of $C$ are indexed such that
the columns corresponding to variables $y_{jab}$ for each task $T_j$ appear in adjacent
positions. First, we may assume that variables $y_{jab}$ with zero value and the
corresponding columns are removed from $z$ and $C$, respectively. We will show
that there exists a constant size subset $C'$ of these columns in which the number
of non-zero rows is smaller than the number of columns. The non-zero entries
of $C'$ induce a singular matrix of constant size. Therefore, we can find a
non-zero vector $z'$ with $C'z' = 0$ in constant time. Let $\gamma > 0$ be the smallest value
such that at least one component of the vector $z + \gamma z'$ is either zero or one (if
necessary we augment $z'$ by adding some zero entries).

We assume that each task $T_j$ (during the procedure) has at least two columns
in $C$. If task $T_j$ has only one column in $C$, then the corresponding variable $y_{jab}$
must have value 1. The number of equations of type (c) is bounded by the
constant $K = (2^m - 1)(2k\mu + 1)/\delta^\mu$ [the first factor is the number of non-empty
subsets and the second the number of blocks]. Let $C'$ be the set formed by the
first $2K + 2$ columns of $C$. We note that at most $2K + 1$ rows of $C'$ have
non-zero entries. By the above assumption on the number of columns for each task,
at most $K + 1$ of these rows come from the constraints (b) and at most $K$
non-zero rows come from the constraints (c). Let $C''$ be the submatrix of $C'$
induced by the non-zero rows. Since $C''$ has at most $2K + 1$ rows and exactly
$2K + 2$ columns, the matrix $C''$ is singular and there is a non-zero vector $z'$ such
that $C''z' = 0$. Using linear algebra and the constant size of $C''$, such a vector $z'$
can be found in constant time. We can repeat this procedure until there are at
most $2K + 1$ columns in $C$. Therefore, the total number of iterations is at most
the initial number of columns minus $2K + 1$, and each iteration can be done in
constant time. By the above
argument, there are at most $2K + 1$ variables $y_{jab}$ that may have values different
from 0 or 1. Since at the end of the rounding procedure each task has either 0
or at least 2 columns in $C$, at most $K$ tasks receive fractional assignments.
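
A single rounding iteration can be sketched as follows: find a non-zero vector in the null space of a small matrix with more columns than non-zero rows, then move along it until some component reaches 0 or 1. The code uses exact rational Gaussian elimination; the function names are illustrative, and the input is assumed to satisfy the preconditions from the text (all components strictly between 0 and 1, fewer independent rows than columns).

```python
from fractions import Fraction


def null_vector(rows, ncols):
    """Return a non-zero z' with rows * z' = 0 (requires rank < ncols)."""
    mat = [[Fraction(v) for v in r] for r in rows]
    pivots = []                      # pivots[i] = pivot column of row i
    r = 0
    for c in range(ncols):
        if r == len(mat):
            break
        piv = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if piv is None:
            continue
        mat[r], mat[piv] = mat[piv], mat[r]
        pv = mat[r][c]
        mat[r] = [v / pv for v in mat[r]]
        for i in range(len(mat)):
            if i != r and mat[i][c] != 0:
                f = mat[i][c]
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(ncols) if c not in pivots)
    z = [Fraction(0)] * ncols
    z[free] = Fraction(1)
    for row_idx, c in enumerate(pivots):
        z[c] = -mat[row_idx][free]
    return z


def rounding_step(C, z):
    """Move z along a null vector of C until a component hits 0 or 1."""
    direction = null_vector(C, len(z))
    steps = []
    for zi, di in zip(z, direction):
        if di > 0:
            steps.append((1 - zi) / di)
        elif di < 0:
            steps.append(zi / -di)
    gamma = min(steps)               # smallest step that reaches 0 or 1
    return [zi + gamma * di for zi, di in zip(z, direction)]
```

Because the step stays in the null space, the rows of type (b) and (c) keep their right-hand sides, while at least one more variable becomes integral.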

Lemma 2. For the set $\mathcal{F}$ of small tasks with fractional assignments after the
rounding procedure, it holds that $|\mathcal{F}| \le K$.

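Let $a^{(j)} \in A$ and $b^{(j)} \in B$ be the unique vectors such that $y_{j a^{(j)} b^{(j)}} = 1$ for
$T_j \in \mathcal{S} \setminus \mathcal{F}$. In the next step, we compute for every non-empty subset $\rho \subseteq M$
and block $q$ with $e_{q,\rho} > 0$ a subset $\mathcal{S}_{q\rho} \subseteq \mathcal{S} \setminus \mathcal{F}$ of tasks such that the total
execution length
$\sum_{T_j \in \mathcal{S}_{q\rho}} \sum_{1 \le k \le \mu : a^{(j)}_k = q,\, b^{(j)}_k = \rho} p_{jk}(\rho) \ge e_{q,\rho}$, and there is one
task $T_{j(q,\rho)} \in \mathcal{S}_{q\rho}$ for which
$\sum_{T_j \in \mathcal{S}_{q\rho} \setminus \{T_{j(q,\rho)}\}} \sum_{1 \le k \le \mu : a^{(j)}_k = q,\, b^{(j)}_k = \rho} p_{jk}(\rho) < e_{q,\rho}$.

Let $\mathcal{U} = \{T_{j(q,\rho)} \mid \text{block } q, \text{subset } \rho\}$. In total, we get a set $\mathcal{U} \cup \mathcal{F}$ of tasks with
cardinality at most $2K$, and a subset $\mathcal{V} = (\bigcup_{q,\rho} \mathcal{S}_{q\rho}) \setminus \mathcal{U}$ of total execution length
at most $\sum_{q,\rho} e_{q,\rho} \le K\varepsilon$. By choosing the accuracy of the relaxed decision
procedure small enough, the total execution length of $\mathcal{V}$
can be bounded by a small fraction of OPT. Using that $s \le (1 + \varepsilon/8)OPT_R$, this implies: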

Lemma 3. The objective function value of the computed linear program solution 
restricted to P' = P \ (U U P) is at most OPTr + ^OPT, and \IA\J P\ < 2K. 

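In the next step, we have to show how to compute a feasible schedule for the
tasks in $\mathcal{T}'' = \mathcal{T} \setminus (\mathcal{U} \cup \mathcal{V} \cup \mathcal{F})$. The other tasks in $\mathcal{U} \cup \mathcal{V} \cup \mathcal{F}$ will be scheduled
afterwards at the end of the schedule.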



2.4 Shifting Procedure 

The main difficulty is the case with more than one operation per task inside
a snapshot or a block. In this case, the algorithm in the last step might
generate infeasibilities between these operations. Therefore, we introduce and
use a shifting procedure for each snapshot to avoid this difficulty. Recall that we
have a hierarchy of blocks of different depths (i.e. a block of depth $i$ consists of
$1/\delta$ blocks of depth $i + 1$).

Consider one snapshot $M(\ell)$, $\ell \in \{1, \ldots, g\}$. Let $seq = (set_1, \ldots, set_\mu)$ be a
fixed sequence of processor sets $set_i \subseteq M$, $set_i \ne \emptyset$ for $1 \le i \le \mu$. Furthermore,
let $\mathcal{T}_{seq} \subseteq \mathcal{T}''$ be the set of small tasks $T_j$ where operation $O_{ji}$ is scheduled on
processor set $set_i$ for $1 \le i \le \mu$. For each task $T_j \in \mathcal{T}''$ such a sequence can
be derived from the LP solution. Finally, let $\mathcal{T}_{seq}(\ell) \subseteq \mathcal{T}_{seq}$ be the subset of
tasks $T_j$ where at least one operation of the task is assigned to a block inside
snapshot $M(\ell)$, $1 \le \ell \le g$. Since the LP has generated for each $T_j \in \mathcal{T}_{seq}(\ell)$ a
unique vector $a^{(j)} = (a^{(j)}_1, \ldots, a^{(j)}_\mu)$ such that $a^{(j)}_1 \le \cdots \le a^{(j)}_\mu$, for each $T_j \in \mathcal{T}_{seq}(\ell)$ a
sequence of consecutive operations $O_{j, s^{(\ell)}_j}, \ldots, O_{j, e^{(\ell)}_j}$ lies inside snapshot $M(\ell)$.
In the shifting procedure, we modify the assignments of blocks only for these
operations.

For each snapshot $M(\ell)$, we insert different blocks of different depths. All
inserted blocks will have the same set of free processors as $M(\ell)$. We add one
block of depth 1 to the left side of the snapshot. The length of this block will be
$\delta t_\ell$. Next we add one block of depth 2 with length $\delta^2 t_\ell$ to the left side of each
original block of depth 1. We repeat this process until depth $\mu - 1$. In iteration $i$,
$1 \le i \le \mu - 1$, we have inserted $1/\delta^{i-1}$ blocks of depth $i$. The total enlargement
of the whole interval can be bounded by
$\sum_{i=1}^{\mu-1} \frac{1}{\delta^{i-1}}\, \delta^i t_\ell = \sum_{i=1}^{\mu-1} \delta t_\ell = (\mu - 1)\delta t_\ell$.
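
The enlargement bound can be verified with a short computation. This sketch assumes, as described above, that iteration $i$ inserts one block of depth $i$ to the left of each original block of depth $i - 1$ (one block in total for $i = 1$); the function names are illustrative.

```python
from fractions import Fraction


def inserted_blocks(t, delta, mu):
    """List (depth, count, length) of the blocks inserted into one snapshot."""
    out = []
    for i in range(1, mu):
        count = int((1 / delta) ** (i - 1))   # one per original block of depth i-1
        length = delta ** i * t               # length of a block of depth i
        out.append((i, count, length))
    return out


def enlargement(t, delta, mu):
    """Total length added to a snapshot of length t."""
    return sum(count * length for _, count, length in inserted_blocks(t, delta, mu))
```

Every depth contributes exactly $\delta t_\ell$, so the total is $(\mu - 1)\delta t_\ell$ regardless of how fine the deepest blocks are.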

Next we apply our shifting procedure for the first $\mu - 1$ operations. The key
idea is to shift the $i$-th operation at least one block of depth $i$ to the left, but
not into one of the blocks of depth $i - 1$ to the left side. Notice that we do
not consider inserted blocks of depth $1, \ldots, i - 1, i + 1, \ldots, \mu - 1$ in iteration $i$.
In the first iteration, we consider all first operations of tasks $T_j \in \mathcal{T}_{seq}(\ell)$ with
$s^{(\ell)}_j = 1$. We start with the leftmost original block $q$ of depth $\mu$ in $M(\ell)$ and
shift all operations $O_{j1}$ executed on $set_1$ into the first inserted block $q_{new}(1, 1)$
to the left side. To do this, we reduce the length $L_{q,set_1}$ and store the free
space $\sum_{T_j \in \mathcal{T}_{seq}(\ell) : a^{(j)}_1 = q} p_{j1}(set_1)$ in block $q$. The free space in block $q_{new}(1, 1)$
is now $\delta t_\ell - \sum_{T_j \in \mathcal{T}_{seq}(\ell) : a^{(j)}_1 = q} p_{j1}(set_1)$. Then, we go over to the next original
block to the right side and shift again the corresponding operations into block
$q_{new}(1, 1)$. We repeat this procedure until the block $q_{new}(1, 1)$ is completely used
with respect to $set_1$. At this point, we may interrupt one operation and shift the
remaining part of this operation and the next operations into the first original
block of depth $\mu$ with some free space. During the shifting we have to update
the free spaces in the blocks. Notice that we do not use inserted blocks of depth
$2, \ldots, \mu - 1$.

After the first iteration, every first operation of a task $T_j \in \mathcal{T}_{seq}(\ell)$ (with
$s^{(\ell)}_j = 1$) is shifted at least one block of depth 1 to the left. On the other hand,
several operations are preempted, but the lengths $L_{q,set_1}$ are not increased. The
total number of preemptions can be bounded by $1 + 1/\delta^\mu$. Now we go over to
the second operations of tasks $T_j \in \mathcal{T}_{seq}(\ell)$ [only tasks with $s^{(\ell)}_j \le 2 \le e^{(\ell)}_j$
are considered]. Again, we start with the leftmost original block $q$ of depth
$\mu$ and shift all operations $O_{j2}$ executed on $set_2$ into the first inserted block
$q_{new}(2, 1)$ of depth 2. Again, we reduce the length $L_{q,set_2}$ and store the free
space generated by the processing times $p_{j2}(set_2)$ of all shifted operations $O_{j2}$.
Again, we iteratively consider all blocks of depth $\mu$ inside the leftmost block of
depth 1. Similar to above, the procedure shifts the second operations at least
one block of depth 2 to the left side. After considering the last block inside the
leftmost block of depth 1, we go over to the second block of depth 1. But now
we shift the second operations into the second inserted block $q_{new}(2, 2)$ of depth
2. Notice that we do not shift operations into the neighbour block (to the left
side) of depth 1. Using this idea, we eliminate infeasibilities between the first and
second operations and we obtain different assigned blocks. Again, we have not
increased the original values $L_{q,set_2}$, but some operations are preempted (during
the procedure). The number of preemptions in the 2nd iteration is bounded by
l/5+l/5^^. After p—1 iterations, we have p — P preemptions 

(using that p <\ and 5 < 1). Among all snapshots M{£), this gives us at most 
{2kp + ^)p preempted operations and an enlargement of the schedule of at 
most p50PTji{l + €/S). We apply this shifting procedure for all job types. Since 
there are at most (2™ — 1)'' different sequences of non-empty subsets of M, the 
number of preemptions is at most (2kp+ 1)-^^ — , and the total enlargement 
is bounded by /n5(2™ - 1 )^OPTr( 1 -k e/8). 

Let $\mathcal{X}$ be the set of small tasks that are preempted during the shifting
procedure and let $\mathcal{T}''' = \mathcal{T}'' \setminus \mathcal{X}$. The tasks in $\mathcal{X}$ will be scheduled again at the end of
the schedule. During the shifting procedure, we update the assignments of the
blocks to the operations. Let {Pp , . . . ,dP) be the generated new sequence of 
blocks. The main result of the shifting procedure is that a) < ... < a/ . Re- 
member that the number of blocks is increased by — P=t ^or each 

snapshot. On the other hand, operations (of tasks in $\mathcal{T}'''$) that are completely
assigned to a new block can be executed one after another without preemptions
and without generating conflicts [they all have the same processor set $set_i$ in the
new block]. This implies that in the final step we have to generate a schedule
(or pseudo schedule $PS(q)$) only for the original blocks of depth $\mu$. If we define
S = I the total enlargement is bounded by |(1 + ^)OPTfi. 

2.5 Computation of Pseudo Schedules 

In the final phase, the algorithm computes a pseudo schedule $PS(q)$ for each
block $q$. We consider in this phase only the tasks in $\mathcal{T}'''$. The tasks in $\mathcal{T}'''$ and
their operations are assigned uniquely to blocks. For each block $q$ and non-empty
subset $\rho \subseteq M$, we have a set of operations

$O_{q\rho} = \{O_{jk} \mid T_j \in \mathcal{T}''',\ a^{(j)}_k = q,\ b^{(j)}_k = \rho\}$.

The set $O_{q\rho}$ will be scheduled in block $q$ on subset $\rho$, where $\rho \subseteq F(q)$. Let
$\hat{L}_{q\rho} = \sum_{O_{jk} \in O_{q\rho}} p_{jk}(\rho)$ be the total processing time of operations
in $O_{q\rho}$. Using the steps before, we know that
$\hat{L}_{q\rho} \le \bar{L}_{q\rho} = \sum_{1 \le i \le n_q : \rho \in P_{q,i}} x^*_{q,i} \le \delta^\mu t_\ell$.
From the left to the right, we now place the operations of $O_{q\rho}$ on the
free processors in $F(q)$. To do this, we consider each partition $P_{q,i}$ with value
$x^*_{q,i} > 0$. For each set $\rho$ in the partition $P_{q,i}$, we place a sequence of operations
that use processor set $\rho$ with total execution length $x^*_{q,i}$. If necessary, the last (and
first) operation assigned to $\rho$ is preempted. Since
$\sum_{1 \le i \le n_q : \rho \in P_{q,i}} x^*_{q,i} \ge \hat{L}_{q\rho}$, the
procedure completely schedules all operations in $O_{q\rho}$ for every $\rho \subseteq F(q)$, and it
runs in $O(n)$ time. Let $\mathcal{W}_q$ be the set of preempted (and therefore incorrectly
scheduled) tasks in $PS(q)$, and let $\mathcal{W} = \bigcup_q \mathcal{W}_q$. The following lemma gives an
upper bound on the cardinality of $\mathcal{W}$.
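
For one block $q$ and one processor set $\rho$, the left-to-right placement can be sketched as a greedy fill of consecutive slots, one slot per partition that contains $\rho$. The function name and the representation of operations as (name, length) pairs are assumptions for illustration.

```python
def fill_slots(slots, ops):
    """Place operations into consecutive slots, splitting at slot boundaries.

    slots: lengths x_i of the partitions containing the processor set.
    ops:   list of (name, length); total length must not exceed sum(slots).
    Returns (placement, preempted): the pieces per slot and the names of the
    operations that were split across two or more slots.
    """
    placement = [[] for _ in slots]
    preempted = set()
    s = 0
    room = slots[0] if slots else 0
    for name, length in ops:
        remaining = length
        while remaining > 0:
            while room == 0:
                s += 1
                if s >= len(slots):
                    raise ValueError("operations exceed the free space")
                room = slots[s]
            piece = min(remaining, room)
            placement[s].append((name, piece))
            if piece < remaining:
                preempted.add(name)   # continues in a later slot
            remaining -= piece
            room -= piece
    return placement, preempted
```

Only operations that straddle a slot boundary are split, so the number of preempted operations per processor set is bounded by the number of slots, which is what makes the set of incorrectly scheduled tasks small.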

Lemma 4. $|\mathcal{W}| \le m^{m+1}(2k\mu + 1)/\delta^\mu$.

We remove the tasks in W of tasks and execute them at the end of the 
schedule. This gives a feasible schedule of length at most OPTji{l+ ^) + S{UU 
P U W U ff), where 5(3^) denotes the length of subset y CT, where each task 
Tj G 3^ is executed on the fastest processor set corresponding to the dj value. In 
the following, we will bound the length of W LLF U W U df. The previous bounds 
also imply the following inequality: 

Corollary 1. \U\JF\JWyjX\ < k. 

Finally, we have to choose k such that the removed tasks can be scheduled 
with length at most < jOPT. This can be achieved by a combinatorial 
lemma proved by the authors in m This concludes the proof of our main 
result: 

Theorem 1. For any fixed $m$ and $\mu$, there is an approximation scheme for
$Jm|set_{ij}, op \le \mu|C_{\max}$ (the general multiprocessor job shop scheduling problem)
that for any fixed $\varepsilon > 0$ computes an $\varepsilon$-approximate schedule in $O(n)$ time.

Recent results showing the existence of (linear-time) approximation schemes
for $Jm|op \le \mu|C_{\max}$ (job shop problem), $Pm|fix_j|C_{\max}$ (dedicated
machines), $Pm|size_j|C_{\max}$ (parallel scheduling), $Pm|fctn_j|C_{\max}$
(malleable tasks), $Pm|set_j|C_{\max}$ and $Rm||C_{\max}$ (unrelated
parallel machines) can all be obtained as special cases of Theorem 1.



3 Preemptive Scheduling

In the preemptive model of scheduling, the processing of any job is allowed to be 
interrupted any time at no cost and restarted later. Two variants of this model 
can be considered according to the possibility of changing processor sets after 
preemptions. Preemptive scheduling without migration requires each operation
to be processed by the same processor set during the whole schedule, i.e. if the
operation is preempted, it later has to be resumed on the same set of processors.
Preemptive scheduling with migration allows preempted operations to be later 
resumed on a different subset of processors than the one(s) used before. 

In both variants we use fixed sequences of start- and end-events for opera- 
tions of long jobs (to allow preemptions). For the model without migration, we 
fix in the relative schedule the assignment of processor sets for all operations of 
long jobs, but we use the same type of variables yjab for both long and small 
jobs. For schedules with possible migration, we also have to allow the change 
of processor sets for operations of long jobs. To do this, we split each snapshot
into $2^m$ intervals corresponding to the different subsets of $M$ such that during
interval $I_p$ only processors in the corresponding processor set $P_p$ can be used for
long operations, and all the other processors in $M \setminus P_p$ are used for small
operations. Furthermore, we partition each of these intervals into different subintervals
corresponding to the different mappings $f$ of processors to operations that map
the processors in $P_p$ to operations of long jobs that are feasible for snapshot $\ell$.
By using slightly more complicated linear programming formulations and then 
applying the shifting procedure, the following results can be shown. (The proof 
along with the complete descriptions of the linear programming formulations 
will be given in the full version of this paper.) 

Theorem 2. For any fixed $m$ and $\mu$, there is a polynomial-time approximation
scheme for $Jm|set_{ij}, op \le \mu, pmtn|C_{\max}$ (both with and without migration) that
for any fixed $\varepsilon > 0$ computes an $\varepsilon$-approximate schedule in $O(n)$ time.

4 Conclusion 

In this paper we have proposed linear-time approximation schemes for
preemptive and non-preemptive variants of the general multiprocessor job shop
scheduling problem, showing that this is the most general currently known problem
in shop and multiprocessor task scheduling with makespan minimization that
possesses polynomial-time approximation schemes. This result also integrates
(and extends) many recent results in both job shop [14] and
multiprocessor task scheduling. We have introduced a general shifting
procedure which is an essential component of the algorithms above and is likely
to find further applications in scheduling. This new procedure also allowed us
to avoid using the quite involved vector summation algorithm of Sevastianov
that is a key ingredient in the previously proposed approximation schemes for
job shop scheduling, making the algorithms and presentation simpler and
self-contained.



Abstract. First-order translations have recently been characterized as
the maps computed by aperiodic single-valued nondeterministic finite
transducers (NFTs). It is shown here that this characterization lifts
to "V-translations" and "V-single-valued-NFTs", where V is an
arbitrary monoid pseudovariety. More strikingly, 2-way V-machines are
introduced, and the following three models are shown exactly equivalent
to Eilenberg's classical notion of a bimachine when V is a group variety
or when V is the variety of aperiodic monoids: V-translations,
V-single-valued-NFTs and 2-way V-transducers.



1 Introduction 

The regular languages have been characterized in many ways, in particular
using DFAs, NFAs, and 2-way DFAs [RS59, She59], monadic second order logic
[Büc60, Tra61], and finite semigroups [Eil76b], with an incisive algebraic
parametrization arising from the latter. For instance, aperiodic DFAs recognize exactly
the star-free languages [Sch65], and these are precisely captured by FO, i.e.
first-order logic with order [MP71].

In a separate vein, renewed interest in FO was sparked by the development
of descriptive complexity (see [Imm99]), and the old concept of a translation
became the natural way to reduce one problem to another. FO-translations and
projections, in particular, became the lowest level reductions used in the theory.

Experience with the regular languages would suggest characterizing
FO-translations using aperiodic deterministic finite transducers. But this fails,
because such transducers cannot even map $w_1 w_2 \cdots w_n$ to $w_n^n$. Lautemann and
three of the present authors [LMSV99] showed that the appropriate
automata-theoretic model required to capture FO-translations is the aperiodic
nondeterministic transducer, restricted to output a unique string on any input.
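
To illustrate why nondeterminism helps here, the following sketch simulates a single-valued nondeterministic transducer for the map $w_1 \cdots w_n \mapsto w_n^n$: it guesses the last letter up front, outputs the guess at every step, and accepts only if the guess matches the letter read last. The encoding of runs is an assumption made for this illustration, and aperiodicity of the machine is not verified here.

```python
def run_nft(word, alphabet):
    """Return the set of outputs over all accepting runs on word.

    A run is (guess, last_letter_read, output_so_far). The guess is chosen
    nondeterministically on the first letter and never changes afterwards.
    """
    runs = [(None, None, "")]
    for x in word:
        nxt = []
        for guess, _, out in runs:
            choices = alphabet if guess is None else [guess]
            for g in choices:
                nxt.append((g, x, out + g))
        runs = nxt
    # Accept exactly the runs whose guess equals the last letter read.
    return {out for guess, last, out in runs if guess == last}
```

Every input has exactly one accepting run, so the output set is a singleton even though the machine branches; a deterministic one-way transducer would have to commit to its first output symbol before seeing $w_n$.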

* Supported by NSERC of Canada and by FCAR du Québec.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, 2000.

© Springer-Verlag Berlin Heidelberg 2000

Here we extend this result in several directions. We first generalize from
FO-translations to V-translations, where V is an arbitrary monoid variety (a monoid
variety is the most convincing notion imaginable of a natural set of monoids).
FO-translations are obtained when V is the variety of aperiodic monoids, and
other special cases of such V-logical formulas were studied before, for example
when $\exists$ and $\forall$ quantifiers were replaced by MOD$_q$ quantifiers in earlier work.

Second, we define an NFA to be a V-NFA if applying the subset construction 
to it yields (the familiar notion of) a V-DFA. 

Third, we consider 2-way DFAs, and define what constitutes a 2-way
V-machine. This is delicate because the appropriate definition of the extended
transition function of such an automaton is not obvious, which in fact explains
the absence of an algebraic treatment of 2-way DFAs in the literature.

Our main result is a striking equivalence between the above notions and the
old notion of a bimachine developed by Eilenberg [Eil76a, p. 320]:

Theorem 1. Let $f : \Sigma^* \to \Gamma^*$, and let V be the variety A of aperiodic monoids or
an arbitrary variety of groups. The following statements are equivalent:

(i) f is a V-translation.

(ii) There is a sequential V-bimachine that computes f.

(iii) There is a single-valued nondeterministic V-transducer that computes f.

(iv) There is a 2-way V-machine that computes f.

Moreover, for the much larger class of varieties V closed under reversal,
(i) $\Leftrightarrow$ (ii) $\Leftrightarrow$ (iii).

In the case V = A, Theorem 1 states that FO-translations, aperiodic
bimachines, single-valued nondeterministic aperiodic transducers, and deterministic
aperiodic 2-way transducers all compute exactly the same functions.

Intuitively, the main ingredients of Theorem 1 are that NFAs can simulate
bimachines by guessing the behavior of the second machine on the unread part
v of an input string uv (and they can simulate FO by guessing the possible
sets of inequivalent formulas consistent with u satisfied by uv). The link with
2-way DFAs, in the non-group case, comes from simulating NFAs in the slick way
developed by Hopcroft and Ullman [HU67], yet a considerable amount of care is
needed to ensure that the simulation preserves the required algebraic properties.
(In this paper, we limit our claim to the statement of Theorem 1, although we
suspect that the equivalence between the four models of the theorem holds for
many more varieties.)

Section 2 contains preliminaries and the precise definitions of our
computation models. Section 3, presenting our results, first addresses the models
equivalent for any V closed under reversal, and then brings in the 2-way model. Due
to space restrictions, most proofs have to remain sketchy or are even omitted
from this abstract. For full proofs we refer the reader to [MSTV00].
2 Definition of Models and Preliminary Results

An associative binary operation on a set containing an identity for this operation
defines a monoid. By a monoid variety, we mean a pseudovariety in the sense of
Straubing [Str94, pp. 72ff] or Eilenberg [Eil76b, pp. 109ff]: it is a set of monoids
closed under the finite direct product, the taking of submonoids and of
homomorphic images. Examples of varieties include the commutative monoids, the
groups G, the aperiodics A (i.e., the monoids containing only trivial groups),
the solvable groups $G_{sol}$, or any set of monoids satisfying a set of identities, as
we explain next.

Let $U = \{u_1, u_2, u_3, \ldots\}$ be a countable alphabet of variables, and let $l, r \in U^*$.
Then an equation of the form $l = r$ is a monoid identity. A monoid M satisfies
the above identity if, for every homomorphism $h\colon U^* \to M$, we have $h(l) = h(r)$.
Monoid identities are used to define pseudovarieties of monoids; we refer the
reader to the excellent presentation in [Str94, Chapter V.6]. Precisely, a set V of
monoids is a variety if and only if there exists a sequence $(l_i = r_i)_{i \ge 1}$ of equations
such that a monoid M belongs to V iff it satisfies all but finitely many equations
$l_i = r_i$, see [Eil76b, Sect. V.2].

If M is a finite automaton (one-way or two-way) such that the transformation
monoid of M satisfies the identity $l = r$, then we say that $l = r$ holds in M.

Deterministic finite automata are, as usual [HU79, p. 17], given by
$M = (S, \Sigma, \delta, s, F)$, where the different components denote the state set, the
input alphabet, the transition function, the initial state, and the set of final
states, respectively. The extended transition function [HU79, p. 17] of M is
denoted by $\hat\delta$. The transformation monoid of M is the set
$\{\hat\delta(\cdot, w)\colon S \to S \mid w \in \Sigma^*\}$ with the operation
of composition of functions. We say that M is a V-DFA if its transformation
monoid is in V. A V-language is a language accepted by a V-DFA.

An equivalence relation $\sim$ on $\Sigma^*$ is a congruence if $u \sim v$ implies
$(\forall x, y \in \Sigma^*)[xuy \sim xvy]$. A congruence $\sim$ induces a monoid
$\Sigma^*/{\sim}$ isomorphic to the transformation monoid of the pre-automaton
$(\Sigma^*/{\sim}, \Sigma, ([u]_\sim, a) \mapsto [ua]_\sim)$. An
example of a congruence is the syntactic congruence of a language L: $u \sim_L v$
iff $(\forall x, y \in \Sigma^*)[xuy \in L \text{ iff } xvy \in L]$. Let a DFA $M = (S, \Sigma, \delta, s, F)$ be given.
Another example of a congruence is: $u \sim_M v$ iff $\hat\delta(\cdot, u) = \hat\delta(\cdot, v)$. Then $\Sigma^*/{\sim_M}$ is
isomorphic to the transformation monoid of M. Writing $\sim_{pq}$ for the syntactic
congruence of the language $L(p, q) = \{w \mid \hat\delta(p, w) = q\}$, there is an injective morphism
$\Sigma^*/{\sim_M} \to \prod_{p,q \in S} (\Sigma^*/{\sim_{pq}})$ and a surjective morphism $\Sigma^*/{\sim_M} \to \Sigma^*/{\sim_{pq}}$.
These facts can be shown to imply that M is a V-DFA iff $L(p, q)$ is a V-language
for each $p, q \in S$.



2.1 FO-Translations and V-Translations 

We consider first-order logic with linear order, denoted by FO (in the literature,
this is often denoted FO[<], see, e.g., [Imm99]). We restrict our attention to
string signatures, i.e., signatures of the form $(C_{a_1}, \ldots, C_{a_s})$, where all the
predicates $C_{a_i}$ are unary, and in every structure $\mathcal{A}$, $\mathcal{A} \models C_{a_i}(j)$ iff the jth symbol
in the input is the letter $a_i$. Such structures are thus words over the alphabet
$\Sigma = \{a_1, \ldots, a_s\}$, and first-order variables range over positions within such a word,
i.e., from 1 to the word length n. A formula from this logic is called a formula
over $\Sigma$. Our basic formulas are built from variables in the usual way, using the
Boolean connectives $\{\wedge, \vee, \neg\}$, the relevant predicates $C_{a_i}$ together with $\{=, <\}$,
the quantifiers $\{\exists, \forall\}$, and parentheses.

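Let $\Sigma$ be as above, and consider a second alphabet $\Gamma = \{b_1, \ldots, b_{t+1}\}$. Fix an
order $b_1 < \cdots < b_{t+1}$ on $\Gamma$. Let $\varphi_1, \ldots, \varphi_t$ be first-order formulas over $\Sigma$, each
with one free variable x. These formulas define a mapping
$[\varphi_1, \ldots, \varphi_t]\colon \Sigma^* \to \Gamma^*$
as follows: Let $w = w_1 \cdots w_n \in \Sigma^n$. Then $[\varphi_1, \ldots, \varphi_t](w) = v_1 \cdots v_n \in \Gamma^n$, where

$$
v_i = \begin{cases}
b_1 & \text{if } w \models \varphi_1(i),\\
b_2 & \text{if } w \models \neg\varphi_1(i) \wedge \varphi_2(i),\\
\;\vdots & \\
b_{t+1} & \text{if } w \models \neg\varphi_1(i) \wedge \neg\varphi_2(i) \wedge \cdots \wedge \neg\varphi_t(i),
\end{cases}
$$

for $1 \le i \le n$. In this definition, $\varphi_j(i)$ is the formula that is obtained when in
$\varphi_j$ variable x takes value i.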

This function is called a FO-translation or first-order definable translation,
see [LMSV99]. (In the more general case, where the formulas are
allowed to have more than one free variable, these functions are called first-order
reductions or first-order interpretations, see [Imm99].)

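Now let $\Sigma$ and $\Gamma$ be as above and $A \subseteq \Gamma^*$. Then we say that $Q_A x[\varphi_1, \ldots, \varphi_t]$
is a $Q_A$FO formula (over $\Sigma$). For $w \in \Sigma^*$, we define $w \models Q_A x[\varphi_1, \ldots, \varphi_t]$ iff
$[\varphi_1, \ldots, \varphi_t](w) \in A$.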

In this paper, the sets A defining quantifiers will most of the time be monoid
word problems. For a monoid M and a subset $F \subseteq M$, we define the word problem
$W(M, F)$ as the language $\{m_1 \cdots m_k \in M^* \mid k \in \mathbb{N},\ m_1 \circ \cdots \circ m_k \in F\}$, where
$\circ$ denotes multiplication in M.

Let V be a pseudovariety of monoids. We will now use V word problems to
define translations as follows. First, define V-formulas inductively by:

— Every quantifier-free formula is a V-formula.

— If $\varphi$ is quantifier-free, x a variable, $M \in$ V, and $F \subseteq M$, then $Q_{W(M,F)} x\, \varphi$
is a V-formula.

— Every Boolean combination of V-formulas is a V-formula.

A V-translation is a translation $[\varphi_1, \ldots, \varphi_t]$, where $\varphi_1, \ldots, \varphi_t$ are V-formulas
with one free variable.

Examples. The quantifier $\exists$ is the quantifier $Q_{W(\mathrm{OR},\{1\})}$, where OR is the
aperiodic monoid defined by the binary OR. Straubing [Str94] surveys an elegant
theory in which FO is supplemented with “monoidal” quantifiers such as $\mathrm{MOD}_q$,
where $\mathrm{MOD}_q x\, \varphi$ holds iff $\varphi(x)$ holds in a multiple of q word positions x. Such
quantifiers are $Q_{W(\mathbb{Z}_q,\{0\})}$ quantifiers. Hence $[Q_{W(\mathbb{Z}_q,\{0\})} y\,(y < x \wedge C_a y)]$ is a
translation, mapping $w_1 w_2 \cdots w_n \in \{a, b, c\}^*$ to $v_1 \cdots v_n \in \{0, 1\}^n$, such that
$v_i = 1$ iff $w_1 \cdots w_{i-1}$ contains a multiple of q occurrences of a. Other monoidal
quantifiers, e.g., $Q_{W(S_5,\{e\})}$ where $S_5$ is the (nonsolvable) symmetric group on
5 points (see, e.g., [BIS90]), have been used in the literature to capture the
complexity class $NC^1$ using FO.
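
The counting translation from the examples above can be evaluated position by position. The following is a minimal Python sketch, not part of the paper: the function name `mod_q_translation` and its parameters are illustrative assumptions, and the output letter 1 is assumed to mark the positions where the formula holds.

```python
def mod_q_translation(w, q, letter="a"):
    """Evaluate [Q_{W(Z_q,{0})} y (y < x and C_letter(y))] on the word w.

    Position i receives "1" iff the prefix strictly before i contains a
    multiple of q occurrences of `letter`, and "0" otherwise.
    """
    out = []
    count = 0  # occurrences of `letter` strictly before the current position
    for ch in w:
        out.append("1" if count % q == 0 else "0")
        if ch == letter:
            count += 1
    return "".join(out)
```

For instance, with q = 2 the word abcab is mapped to 10001: only the empty prefix and the prefix abca contain an even number of occurrences of a.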

2.2 Bimachines and Bisequential Functions 

Eilenberg [Eil76a, p. 320] defined a bimachine to be a triple $M = (M_1, M_2, g)$
where $M_1 = (S_1, \Sigma, \delta_1, s_1)$, $M_2 = (S_2, \Sigma, \delta_2, s_2)$ are DFAs without final states, and
$g\colon S_1 \times S_2 \times \Sigma \to \Gamma$. For $i \in \{1, \ldots, |w|\}$, the output of M on input $w = w_1 \cdots w_n$
at position i is $o_M(w, i) = g(\hat\delta_1(s_1, w_1 \cdots w_{i-1}), \hat\delta_2(s_2, w_n \cdots w_{i+1}), w_i)$. The output
of M on input w is $T_M(w) = o_M(w, 1) \cdots o_M(w, n)$. M is a V-bimachine if the
transformation monoids of both $M_1$ and $M_2$ are in V.
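
To make the definition concrete, here is a minimal Python sketch of how a bimachine produces its output: one left-to-right pass for $M_1$, one right-to-left pass for $M_2$, then one call to g per position. The function `run_bimachine`, the dictionary encoding of the transition functions, and the example machine (which outputs the last input letter at every position) are illustrative assumptions, not taken from the paper.

```python
def run_bimachine(delta1, s1, delta2, s2, g, w):
    """Output of the bimachine (M1, M2, g) on the word w, as a list of letters.

    delta1, delta2: dicts mapping (state, letter) -> state.
    g: function (state of M1, state of M2, letter) -> output letter.
    """
    n = len(w)
    # left[k] = state of M1 after reading the first k letters, left to right
    left = [s1]
    for ch in w:
        left.append(delta1[(left[-1], ch)])
    # right[k] = state of M2 after reading the last k letters, right to left
    right = [s2]
    for ch in reversed(w):
        right.append(delta2[(right[-1], ch)])
    # Position i (0-based) sees the prefix w[:i] and the suffix w[i+1:].
    return [g(left[i], right[n - 1 - i], w[i]) for i in range(n)]


# Hypothetical example: every position outputs the last letter of the input.
SIGMA = "ab"
delta1 = {("*", c): "*" for c in SIGMA}      # M1 ignores its input
delta2 = {("?", c): c for c in SIGMA}        # M2 stores the first letter it reads
delta2.update({(p, c): p for p in SIGMA for c in SIGMA})


def g(p, q, letter):
    # If M2 has read nothing yet, the current position is the last one.
    return letter if q == "?" else q
```

Since $M_2$ reads the input from the right, the first letter it sees is $w_n$, which is exactly the information a one-way deterministic transducer lacks.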

We define a bisequential function as a pair $f = (\alpha, m)$, where $\alpha$ is a congruence
over some alphabet $\Sigma$, and m is a function $m\colon \Sigma^*/\alpha \times \Sigma^*/\alpha \times \Sigma \to \Gamma$ for some
alphabet $\Gamma$, and for all $w = w_1 \cdots w_n \in \Sigma^*$, $f(w) = v_1 \cdots v_n$ where
$v_i = m([w_1 \cdots w_{i-1}]_\alpha, [w_{i+1} \cdots w_n]_\alpha, w_i)$. We say that f is a V-bisequential function if the quotient set
$\Sigma^*/\alpha$ with the usual multiplication is in V.

Observe that if V is closed under reversal, a function is V-bisequential
if and only if it is computed by some V-bimachine. Indeed, to compute the
bisequential function $f = (\alpha, m)$, define $M = (M_1, M_2, m)$, where $M_1 = (S_1, \Sigma, \delta_1, s_1)$
and $M_2 = (S_2, \Sigma, \delta_2, s_2)$ are given by $S_1 = S_2 = \Sigma^*/\alpha$, $s_1 = s_2 = [\epsilon]_\alpha$,
$\delta_1([w]_\alpha, a) = [wa]_\alpha$ and $\delta_2([w]_\alpha, a) = [aw]_\alpha$ for all $a \in \Sigma$, $w \in \Sigma^*$.
The reader may check that $f = T_M$ and
that the transformation monoids of $M_1$ and $M_2$ are in V.

The converse, given M to find a V-bisequential function f such that $T_M = f$,
is proved in a very similar way.

2.3 Single-Valued Nondeterministic Transducers

In [LMSV99], a nondeterministic finite transducer is defined to be a tuple
$M = (S, \Sigma, \Gamma, \delta, I, F)$, where S is the set of states, $\Sigma$ is the input alphabet, $\Gamma$ is the
output alphabet, $\delta \subseteq S \times \Sigma \times \Gamma \times S$ is the transition relation, $I \subseteq S$ is the set of
initial states and $F \subseteq S$ is the set of final states. For a string $w = w_1 \cdots w_n \in \Sigma^*$
we define the set $T_M(w)$ of outputs of M on input w as follows. A string $v \in \Gamma^*$
of length n is in $T_M(w)$ if there is a sequence $s_0, s_1, \ldots, s_n$ of states such that
$s_0 \in I$, $s_n \in F$ and, for every i, $1 \le i \le n$, we have $(s_{i-1}, w_i, v_i, s_i) \in \delta$.

We say that a nondeterministic finite transducer $(S, \Sigma, \Gamma, \delta, I, F)$ is a
V-transducer, or that an NFA $(S, \Sigma, \delta, s_0, F)$ is a V-NFA, if applying the subset
construction to the pre-NFA $(S, \Sigma, \delta)$ yields a pre-DFA $(2^S, \Sigma, \delta')$ whose
transformation monoid is in V.

Proposition 2. The following statements are equivalent for a transducer M
having $(S, \Sigma, \delta)$ as pre-NFA:

(i) M is a V-transducer.

(ii) For each $p, q \in S$, the language $L(p, q)$ accepted by the NFA $(S, \Sigma, \delta, p, \{q\})$
is a V-language.
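
For the special case V = A, the definition of a V-NFA can be checked mechanically: build the subset construction, generate its transformation monoid, and test aperiodicity. The Python sketch below is only an illustration under that assumption; the names `subset_monoid`, `compose` and `is_aperiodic`, and the encoding of the pre-NFA as a set of triples, are hypothetical and not from the paper.

```python
from itertools import combinations


def compose(f, g):
    """Transformation 'first f, then g' on indices 0..len(f)-1."""
    return tuple(g[x] for x in f)


def subset_monoid(states, alphabet, delta):
    """Transformation monoid of the pre-DFA obtained by the subset construction.

    delta is a set of triples (p, a, q) of the pre-NFA. Each monoid element is
    a tuple mapping subset indices to subset indices.
    """
    subsets = [frozenset(c) for r in range(len(states) + 1)
               for c in combinations(sorted(states), r)]
    index = {X: k for k, X in enumerate(subsets)}
    gens = {}
    for a in alphabet:
        images = [frozenset(q for (p, b, q) in delta if b == a and p in X)
                  for X in subsets]
        gens[a] = tuple(index[Y] for Y in images)
    identity = tuple(range(len(subsets)))
    monoid, frontier = {identity}, [identity]
    while frontier:  # close the generators under composition
        f = frontier.pop()
        for gen in gens.values():
            h = compose(f, gen)
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid


def is_aperiodic(monoid):
    """True iff every element f satisfies f^k = f^(k+1) for some k."""
    for f in monoid:
        power, seen = f, set()
        while power not in seen:
            seen.add(power)
            power = compose(power, f)
        # `power` now lies on the cycle of powers of f; the monoid is
        # aperiodic only if that cycle is a single fixed point.
        if compose(power, f) != power:
            return False
    return True
```

A two-state NFA that swaps its states on every letter yields a group element of order 2 and is therefore rejected, whereas an NFA whose subset automaton only ever grows its state set is accepted.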

We say that a nondeterministic transducer M is single-valued if, for every
$w = w_1 \cdots w_n \in \Sigma^*$, there is a single computation $(s_0, w_1, v_1, s_1)(s_1, w_2, v_2, s_2) \cdots
(s_{n-1}, w_n, v_n, s_n)$ such that $s_0 \in I$, $s_n \in F$, and $(s_{i-1}, w_i, v_i, s_i) \in \delta$ for $1 \le i \le n$.
Note that in this case, $|T_M(w)| = 1$ for all $w \in \Sigma^*$; if $T_M(w) = \{v\}$, then we write
$T_M(w) = v$.
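
The single-computation condition can be explored on small inputs by enumerating accepting computations. The following Python sketch is an illustration only: the function names and the example transducer (which guesses the last letter, outputs it everywhere, and verifies the guess on the final step) are assumptions, not constructions from the paper.

```python
from itertools import product


def accepting_runs(delta, initial, final, w):
    """All accepting computations of a nondeterministic transducer on w.

    delta: set of transitions (p, input letter, output letter, q).
    Returns a list of pairs (state sequence, output word).
    """
    runs = [((s,), "") for s in initial]
    for a in w:
        runs = [(path + (q,), out + v)
                for (path, out) in runs
                for (p, b, v, q) in delta
                if p == path[-1] and b == a]
    return [(path, out) for (path, out) in runs if path[-1] in final]


def single_valued_up_to(delta, initial, final, alphabet, max_len):
    """Check the single-computation condition on all words of length <= max_len."""
    return all(len(accepting_runs(delta, initial, final, "".join(t))) == 1
               for n in range(max_len + 1)
               for t in product(alphabet, repeat=n))


# Hypothetical transducer: guess the last letter c, output c everywhere,
# and verify the guess on the final step by moving to the state "end".
SIGMA = "ab"
delta = {(c, x, c, c) for c in SIGMA for x in SIGMA}
delta |= {(c, c, c, "end") for c in SIGMA}
initial = set(SIGMA) | {"end"}   # "end" is initial only to accept the empty word
final = {"end"}
```

A bounded check of this kind can only refute single-valuedness, never prove it; it is meant as a way to experiment with the definition.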

Remark 3. We remark that the definition above of single-valuedness is different
from the one given in [LMSV99], since there it was only required that the set of
values $T_M(w)$ consists of at most one word, for every $w \in \Sigma^*$. This still leaves the
possibility that there may be different paths in M for a particular input w which
produce the same output, which is forbidden in the present definition. [LMSV99]
equate the power of aperiodic machines according to their definition with that of
FO-translations. Since our main result in Sect. 3 gives an analogous statement
for aperiodic machines according to our stricter definition, we conclude that for
the aperiodic case, both definitions coincide.

2.4 Two-Way Automata 

A two-way automaton with output is a 7-tuple $M = (\mathcal{L} \uplus \mathcal{R}, \Sigma, \Gamma, \delta, \lambda, l_0, F)$, where

— the set S of states is the disjoint union $\mathcal{L} \uplus \mathcal{R}$ of a set $\mathcal{L}$ (the states “entered
from the left”) and a set $\mathcal{R}$ (the states “entered from the right”),

— $l_0 \in \mathcal{L}$ is the initial state,

— $F \subseteq S$ is the set of final states which, as in the case of one-way transducers,
will play no role in defining the operation of a two-way automaton with
output,

— $\Sigma$ is the input alphabet, $\Gamma$ is the output alphabet,

— $\delta\colon (S \times \Sigma) \cup (\mathcal{L} \times \{\triangleleft\}) \cup (\mathcal{R} \times \{\triangleright\}) \to S$ is a total transition function, where
$\triangleright \notin \Sigma$ and $\triangleleft \notin \Sigma$ are the leftmarker and the rightmarker respectively,

— $\lambda\colon (S \times \Sigma) \to \Gamma$ is the output function.

The meaning of $\delta(s, a) \in \mathcal{L}$ is that M in state s scanning a moves its head
to the right upon entering state $\delta(s, a)$; M moves its head to the left when
$\delta(s, a) \in \mathcal{R}$.

The initial configuration of M on input $w = w_1 w_2 \cdots w_n \in \Sigma^*$ is the situation
in which the state of M is $l_0$ and M scans $w_1$ within the string $\triangleright w_1 w_2 \cdots w_n \triangleleft$
(M scans $\triangleleft$ when $|w| = 0$). We say that M eventually exits $\triangleright w \triangleleft$ if it eventually
encounters a transition $\delta(r, \triangleright) \in \mathcal{R}$ or a transition $\delta(l, \triangleleft) \in \mathcal{L}$ (of course M will
generally bounce off the end markers several times before exiting). We require
that, for any $w = w_1 w_2 \cdots w_n \in \Sigma^*$,

— M from its initial configuration on w eventually leaves every $w_i$ to the right,
$1 \le i \le n$, and eventually exits $\triangleright w \triangleleft$; this is analogous to the (unspoken)
requirement that a one-way transducer must traverse its input and halt,

— from any state $l \in \mathcal{L}$ scanning $w_1$, M eventually exits $\triangleright w \triangleleft$; this requirement
is analogous to the (unspoken) fact that a one-way transducer eventually
runs out of input regardless of its initial configuration,

— from any state $r \in \mathcal{R}$ scanning $w_n$, M also eventually exits $\triangleright w \triangleleft$; this is
justified by the natural desire to maintain symmetry between left and right
in a two-way machine.
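Each $w = w_1 w_2 \cdots w_n \in \Sigma^*$ coerces M into a behavior described by a behavior
function $\delta_w\colon S \to S$,

$$
\delta_w(s) = \begin{cases}
\text{state in which } M \text{ exits } w_1 w_2 \cdots w_n \text{ when started in state } s \text{ scanning } w_1, & \text{if } s \in \mathcal{L};\\
\text{state in which } M \text{ exits } w_1 w_2 \cdots w_n \text{ when started in state } s \text{ scanning } w_n, & \text{if } s \in \mathcal{R}.
\end{cases}
$$

The base case is given by $\delta_w(s) = s$ if $|w| = 0$, and $\delta_w(s) = \delta(s, w)$ if $|w| = 1$. The induction
step divides into two similar cases according to whether $s \in \mathcal{L}$ or $s \in \mathcal{R}$. Let
$u \in \Sigma^+$ and $a \in \Sigma$. We only illustrate the case $s \in \mathcal{L}$, namely M entering ua
from the left. The case breaks off into two subcases according to whether M
eventually falls off ua to the left ($\delta_u(s_k) \in \mathcal{R}$) or to the right ($\delta_a(l_k) \in \mathcal{L}$):

$$
\delta_{ua}(s) = \begin{cases}
\delta_u(s_k) & \text{if } (\exists k \ge 0)(\exists l_1, \ldots, l_k \in \mathcal{L})(\exists s_0 = s, s_1, \ldots, s_k,\ \delta_u(s_k) \in \mathcal{R})\\
 & \text{such that } l_i = \delta_u(s_{i-1}) \text{ and } s_i = \delta_a(l_i) \text{ for } 1 \le i \le k;\\
\delta_a(l_k) & \text{if } (\exists k \ge 0)(\exists l_0 = \delta_u(s), l_1, l_2, \ldots, l_k,\ \delta_a(l_k) \in \mathcal{L})(\exists r_1, r_2, \ldots, r_k \in \mathcal{R})\\
 & \text{such that } r_i = \delta_a(l_{i-1}) \text{ and } l_i = \delta_u(r_i) \text{ for } 1 \le i \le k.
\end{cases}
$$

Finally we define the behavior monoid $B(M)$ of M to be the monoid $\{\delta_w \mid
w \in \Sigma^*\}$ under composition. M is a V-machine iff $B(M) \in$ V.

The output of M on input $w = w_1 w_2 \cdots w_n$ is $\lambda(s_1, w_1)\lambda(s_2, w_2) \cdots \lambda(s_n, w_n)$,
where $s_i$ is the state entered by M as it lands on $w_i$ for the last time when
started from its initial configuration on w (when $|w| = 0$ the output is the empty
string).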
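
The behavior function can also be computed by direct simulation on the segment w, without end markers. The Python sketch below illustrates this under stated assumptions: the name `behavior`, the dictionary encoding of the transition function, the step bound, and the toy transition table are all hypothetical. The toy table is meant only to exercise the definition and is not claimed to satisfy the exit requirements imposed on two-way automata above.

```python
def behavior(delta, left_states, w, s, max_steps=10_000):
    """Compute delta_w(s): the state in which M exits the segment w.

    delta: dict mapping (state, letter) -> state.
    left_states: the states "entered from the left"; entering one of them
    moves the head right, entering any other state moves it left.
    """
    if not w:
        return s
    # States entered from the left start on w_1, the others on w_n.
    pos = 0 if s in left_states else len(w) - 1
    state = s
    for _ in range(max_steps):
        state = delta[(state, w[pos])]
        pos += 1 if state in left_states else -1
        if pos < 0 or pos >= len(w):
            return state
    raise RuntimeError("no exit within max_steps; the machine may loop on w")


# Hypothetical toy table: keep moving right on a, turn around for good on b.
toy_delta = {("l", "a"): "l", ("l", "b"): "r",
             ("r", "a"): "r", ("r", "b"): "r"}
toy_left = {"l"}
```

On the toy table, entering aa from the left exits to the right in state l, while entering aab from the left turns around at the b and exits to the left in state r.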

3 Results 

Here we present our different results which together prove Theorem 1. First
we state that for all varieties closed under reversal, translations, bisequential
functions, bimachines, and single-valued nondeterministic transducers yield the
same class of functions. This proves the equivalence of the first three statements
of Theorem 1, since group varieties and A are closed under reversal.

Theorem 4. The following are equivalent when V is a monoid variety closed
under reversal:

(i) f is a V-translation.

(ii) f is a V-bisequential function.

(iii) There is a sequential V-bimachine that computes f.

(iv) There is a single-valued nondeterministic V-transducer that computes f.

The proof, which can be found in the full version of this paper [MSTV00],
consists of a sequence of simulations among the different models that preserve
monoid identities in the sense of Sect. 2. It even turns out that equivalence of
(i) and (ii) holds for all varieties of monoids. Here, we only present the following
implication:
Lemma 5. Let $\varphi_1, \ldots, \varphi_t$ be V-formulas. Then there is a V-bisequential
function f such that $[\varphi_1, \ldots, \varphi_t] = f$.

Proof. We first assume that $t = 1$, i.e. that the image of the translation is over a
two-letter alphabet, which we take to be $\{Y, N\}$. Initially, let $\varphi = \varphi_1$ be a formula
over some alphabet $\Sigma$ of the form $\varphi(x) = Q_A y\, \psi(x, y)$, where A is a V-language
and $\psi$ is quantifier-free. Let $\alpha$ be the congruence relation that defines $A \subseteq \Gamma^*$,
i.e., $\Gamma^*/\alpha \in$ V. For simplicity, we assume that $\Gamma = \{1, 0\}$; the case of non-binary
alphabets is treated similarly.

For every letter $a \in \Sigma$, we define from $\psi$ three formulas $\psi_a^<$, $\psi_a^=$ and $\psi_a^>$ as
follows:

    replace            $C_a(x)$   $C_b(x)$, $b \ne a$   $y < x$   $y = x$   $x < y$
    to get $\psi_a^<$ by     true       false             true      false     false
    to get $\psi_a^=$ by     true       false             false     true      false
    to get $\psi_a^>$ by     true       false             false     false     true

Observe that all these formulas have the only free variable y, since x no longer
appears. Moreover, for each $a \in \Sigma$, we write $\psi_a$ for the constant formula obtained
by further evaluating $\psi_a^=$ with the knowledge that $C_a(y)$. Now recall that the ith
symbol in $[\varphi](w_1 \cdots w_n)$ is equal to Y iff

$$\psi(i, 1)\, \psi(i, 2) \cdots \psi(i, n) \in A, \qquad (1)$$

where for convenience we wrote $\psi(i, j)$ for the zero-one truth value of $\psi(i, j)$
evaluated on $w_1 \cdots w_n$. But the $\{0, 1\}$-word appearing in (1) is precisely the
$\{0, 1\}$-word

$$\psi^<_{w_i}(1) \cdots \psi^<_{w_i}(i-1)\; \psi_{w_i}\; \psi^>_{w_i}(i+1) \cdots \psi^>_{w_i}(n-1)\, \psi^>_{w_i}(n). \qquad (2)$$

Fortunately, every $\psi_a^R$ (for $a \in \Sigma$, $R \in \{<, >\}$) defines a length-preserving
homomorphism, also denoted $\psi_a^R\colon \Sigma^* \to \{0, 1\}^*$. So let a congruence $\sim$ on $\Sigma^*$
be defined as follows: $u \sim v$ iff $\psi_a^R(u) \mathrel{\alpha} \psi_a^R(v)$ holds for each $a \in \Sigma$ and for
each $R \in \{<, >\}$. This is a V-congruence, being the intersection of finitely
many V-congruences (indeed each separate homomorphism $\psi_a^R$ defines a
congruence whose induced monoid is the image of a submonoid of $\{0, 1\}^*/\alpha$). The
V-bisequential function $f = (\sim, m)$ therefore computes $[\varphi]$, where

$$
m\colon (\Sigma^*/{\sim}) \times (\Sigma^*/{\sim}) \times \Sigma \to \{Y, N\}, \qquad
m([u]_\sim, [v]_\sim, a) = \begin{cases}
Y & \text{if } \psi_a^<(u) \circ \psi_a \circ \psi_a^>(v) \in A,\\
N & \text{otherwise.}
\end{cases}
$$

Here $\circ$ computes in $\{0, 1\}^*/\alpha$, and m is well defined because $w_1 \sim w_2$ implies
that $\psi_a^R(w_1) \mathrel{\alpha} \psi_a^R(w_2)$.

Now if $\varphi$ is a Boolean combination of formulas, we consider the
homomorphisms defined by all subformulas of $\varphi$ of the form $Q_A y\, \psi(x, y)$ for quantifier-free
$\psi$. In the case of several formulas $[\varphi_1, \ldots, \varphi_t]$ instead of one formula $[\varphi]$, we again
have to deal with more homomorphisms as above. In both cases, the definition of
m becomes more complicated, but the congruence $\sim$ remains an intersection of
constantly many V-congruences. Hence we still obtain a V-bisequential f. □

When are 2-way machines equivalent to the models of Theorem 4? In the
next lemma, we show that a 2-way V-automaton can always be simulated by
a V-bimachine. The converse, however, requires additional assumptions about
V. Lemma 7 deals with the variety of all aperiodic monoids and the variety of
all groups. The construction that is used in the proof of Lemma 7 results in
particularly simple 2-way automata if V is an arbitrary group variety. This is
stated as Corollary 8.

Lemma 6. Let $M = (\mathcal{L} \uplus \mathcal{R}, \Sigma, \Gamma, \delta, \lambda, l_0, F)$ be a two-way V-automaton with
output. Then there exists a V-bimachine $(M_1, M_2, g)$ that computes $T_M$.

Proof. Given M, we first show how to construct automata $M_1 = (S_1, s_1, \Sigma, \delta_1)$,
$M_2 = (S_2, s_2, \Sigma, \delta_2)$ and function g. Let $\circ$ denote multiplication in $B(M)$ and let e
be the corresponding identity.

We define $M_1$ by $S_1 = B(M)$, $s_1 = e$, and for each $f \in B(M)$, $a \in \Sigma$, we set
$\delta_1(f, a) = f \circ \delta_a$. Intuitively, after reading $w_1 \cdots w_{i-1}$ the automaton $M_1$ has
computed the function $\delta_{w_1 \cdots w_{i-1}}$.

The definition of $M_2$ is analogous: $S_2 = B(M)$, $s_2 = e$, and for each $f \in B(M)$, $a \in
\Sigma$, we set $\delta_2(f, a) = \delta_a \circ f$.

It is easy to show that, for each string $w = w_1 \cdots w_n$ and each position $i \in
\{1, \ldots, n\}$, we obtain $\hat\delta_1(s_1, w_1 \cdots w_{i-1}) = \delta_{w_1 \cdots w_{i-1}}$ and
$\hat\delta_2(s_2, w_n \cdots w_{i+1}) = \delta_{w_{i+1} \cdots w_n}$.
By combining this information with $w_i$ and the fixed behavior of M on the
endmarkers, the sequence of states that M takes at position i can be determined.
From the very definition of the output of a two-way automaton it follows that a
function g as claimed exists and can be obtained directly from the definition of
M.

Next we show that all monoid identities that hold in M hold in $M_1$ and $M_2$
as well. Let $u, v \in \Sigma^*$ be such that $\delta_u = \delta_v$. We even prove the stronger claims that
$\hat\delta_1(\cdot, u) = \hat\delta_1(\cdot, v)$ and $\hat\delta_2(\cdot, u^R) = \hat\delta_2(\cdot, v^R)$.

From the definition of $M_1$ we obtain for all $f \in B(M)$, $w \in \Sigma^*$, that
$\hat\delta_1(f, w) = f \circ \delta_w$, which immediately yields the first claim. From the definition of $M_2$ we
conclude that for each $f \in B(M)$ and $w \in \Sigma^*$, we obtain $\hat\delta_2(f, w) = \delta_{w^R} \circ f$, where $w^R$
denotes the reversal of the word w. This immediately implies the second claim.
Hence we showed that $M_1$ and $M_2$ are V-DFAs. □

Next we turn to a converse of this lemma. An important tool here is a result of
Hopcroft and Ullman [HU67], showing how a two-way automaton can, for each
symbol in its input, determine the states of one left-to-right and one right-to-left
automaton, when they reach the corresponding input symbol. The next lemma
shows that this result carries over to the case of restricted types of automata.






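Lemma 7. Let $(M_1, M_2, g)$ be a sequential bimachine. Then there exists a
2-way automaton $M = (\mathcal{L} \uplus \mathcal{R}, \Sigma, \Gamma, \delta, \lambda, l_0, F)$ which computes the same function as
$(M_1, M_2, g)$ and has the following property: If $\hat\delta_1(\cdot, u) = \hat\delta_1(\cdot, v)$ and $\hat\delta_2(\cdot, u^R) = \hat\delta_2(\cdot, v^R)$
and $uv = vu$ then

$$\delta_{uu} = \delta_{uv} = \delta_{vu} = \delta_{vv}.$$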

Proof. We sketch the idea of the proof, and we refer the reader to [MSTV00]
for the full details. Let $M' = (M_1, M_2, g)$, $M_1 = (Q_1, s_1, \Sigma, \delta_1)$, and $M_2 = (Q_2, s_2, \Sigma, \delta_2)$.

Informally, the behavior of the automaton M on input $w = w_1 \cdots w_n$ is split into
two phases. Roughly, the first phase consists of a complete left-to-right scan in
which M simulates $M_1$ and the second phase consists of n subcomputations each
of which is a right-to-left movement followed by a (possibly empty) left-to-right
movement. In the second phase M simulates the behavior of $M_2$ (which is easy)
and keeps track of the states of $M_1$ (which is more complicated).

In more detail, each of the n subcomputations starts from a situation in which
the automaton is at a position i of its input and knows $p_i = \hat\delta_1(s_1, w_1 \cdots w_i)$ and
$q_i = \hat\delta_2(s_2, w_n \cdots w_{i+1})$ and ends in the corresponding situation at position $i - 1$.

The obvious problem is how to compute $p_{i-1} = \hat\delta_1(s_1, w_1 \cdots w_{i-1})$. If there is
only one state p of $Q_1$ with $\delta_1(p, w_i) = p_i$ then, of course, $p_{i-1} = p$. Otherwise M
proceeds as follows. It moves to the left and maintains, for each state p with
$\delta_1(p, w_i) = p_i$, a set $P_p$ of states. At a position $j < i$, $P_p$ contains all states from
which $M_1$ can reach p by reading $w_{j+1} \cdots w_{i-1}$. It is easy to see that, for each
j, these sets are pairwise disjoint. This process ends in one of the following two
situations. Either all but one of the sets $P_p$ become empty or M reaches the
left delimiter. In either case M can easily conclude $p_{i-1}$ (in the latter case it is
the state p for which $P_p$ contains $s_1$). Now it only needs to find its way back to
position i. In order to do so, it goes one step to the right and chooses one state
from $P_p$ and one state from a nonempty $P_{p'}$ at that position (these two sets were
remembered by M). Then it simulates the behavior of $M_1$ starting from these
two states until the two computations flow together into a single state. At this
time M has reached position i again and now it also knows $p_{i-1}$, as this is given
by the outcome of the simulation of $M_1$ which started from the correct state.
It has remembered $q_i$ during the whole subcomputation, hence it can retrieve
the output symbol of M at position i. Then it starts the next subcomputation
from position $i - 1$. It is this last step of a subcomputation where M actually
simulates the behavior of $M_2$.

The formal definition of M and the (technically involved) proof that it has
the properties claimed in the lemma can be found in [MSTV00]. □
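
The leftward search at the heart of this proof sketch can be illustrated in isolation. The Python sketch below recovers $p_{i-1}$ from $p_i$ by maintaining the sets $P_p$ while moving left; it deliberately omits the walk back to position i and the bookkeeping for $M_2$. All names (`forward_state`, `recover_previous_state`, `check_all`) and the three-state example DFA are illustrative assumptions, not the construction of [MSTV00].

```python
from itertools import product


def forward_state(delta, start, word):
    """Extended transition function: state reached from `start` on `word`."""
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state


def recover_previous_state(states, delta, start, w, i, p_i):
    """Return p_{i-1}, the state of M1 before reading w_i, given p_i (1 <= i <= len(w))."""
    a = w[i - 1]
    # One set P_p per candidate p with delta(p, w_i) = p_i. At position i-1
    # the set P_p holds the states that reach p on the empty word, i.e. {p}.
    sets = {p: {p} for p in states if delta[(p, a)] == p_i}
    j = i - 1  # current head position; 0 stands for the left delimiter
    while True:
        alive = [p for p, P in sets.items() if P]
        if len(alive) == 1:
            return alive[0]
        if j == 0:
            # At the left delimiter the true state of M1 is its start state.
            return next(p for p, P in sets.items() if start in P)
        # Step left over w_j: replace each P_p by its preimage under w_j.
        b = w[j - 1]
        sets = {p: {r for r in states if delta[(r, b)] in P}
                for p, P in sets.items()}
        j -= 1


def check_all(states, delta, start, alphabet, max_len):
    """Compare the backward search with a direct forward computation."""
    for n in range(1, max_len + 1):
        for t in product(alphabet, repeat=n):
            w = "".join(t)
            for i in range(1, n + 1):
                p_i = forward_state(delta, start, w[:i])
                expected = forward_state(delta, start, w[:i - 1])
                if recover_previous_state(states, delta, start, w, i, p_i) != expected:
                    return False
    return True


# Hypothetical non-injective DFA on states 0, 1, 2.
example_states = [0, 1, 2]
example_delta = {(0, "a"): 1, (1, "a"): 1, (2, "a"): 2,
                 (0, "b"): 0, (1, "b"): 2, (2, "b"): 2}
```

The search is correct because the true state of $M_1$ at position j always belongs to the set $P_p$ of the true $p_{i-1}$, so that set never becomes empty, and the sets stay pairwise disjoint because $M_1$ is deterministic.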

If V is a variety of groups then the construction in the proof of Lemma 7
leads to a very simple 2-way automaton. The simple reason is that, for such
V, V-automata have injective transition functions. Hence, for each state p of
the automaton $M_1$ and each symbol a there is exactly one state $p'$ such that
$\delta_1(p', a) = p$. Therefore, M can always deduce $p_{i-1}$ from $p_i$ by one step to the left.
It follows directly that all word equations that hold in $M_1$ and (reversed) in $M_2$
also hold in M and we can conclude the following corollary which considerably
strengthens Lemma 7 for group varieties.

Corollary 8. Let $(M_1, M_2, g)$ be a sequential V-bimachine computing a
function f, where V is an arbitrary group variety. Then f is computable by a 2-way
V-automaton.



Theorem 9. Let V be the variety A or an arbitrary variety of groups. A
function is computed by a sequential V-bimachine iff it is computed by a 2-way
V-machine.

Proof. Lemma 6 shows that a sequential V-bimachine can simulate a 2-way
V-machine. For the converse, there are two cases. The first case is when V
is the variety A of all aperiodic monoids. Let a V-bimachine $(M_1, M_2, g)$ be
simulated by the 2-way automaton M promised by Lemma 7. Since $M_1$ and $M_2$
are aperiodic, there exists $n > 0$ such that for all words w, $\hat\delta_i(\cdot, w^n) = \hat\delta_i(\cdot, w^{n+1})$,
$i = 1, 2$. But then $B(M) \in$ A because Lemma 7 applies with $u = w^n$ and $v = w^{n+1}$
and yields $(\delta_w)^{2n} = (\delta_w)^{2n+1}$,
where $(\delta_x)^k$ is the k-fold composition of $\delta_x$ with itself.

The second case is when V is an arbitrary group variety. In this case the
result follows from Corollary 8. □



4 Conclusion 

We have characterized FO-translations in several natural ways, and we have 
extended the characterization to handle group varieties. The strong equivalences 
obtained, with the exception of the 2-way automaton characterization, extend 
to any monoid variety V closed under reversal. 

We believe that Theorem 1 generalizes to many more varieties. The hurdle
lies in the fine details required to ensure that a 2-way automaton simulating 
any one of the other models preserves enough of the algebraic structure. We feel 
here that we have not exhausted all the tricks. For example, crossing sequence 
arguments may be applicable to improve the construction and its analysis. 

Our careful definition of what constitutes a 2-way V-automaton opens the
way for an algebraic treatment of 2-way automata. The translations studied 
here are akin to the maps underlying the wreath product and the block product 
constructions. This suggests approaches to handle the nesting of monoidal quan- 
tifiers in the logical framework, or equivalently to handle the series connections 
of 2-way automata. This emerges as a promising avenue for future research. 
Abstract. A constructive version of Hausdorff dimension is developed
and used to assign to every individual infinite binary sequence A a
constructive dimension, which is a real number cdim(A) in the interval [0, 1].
Sequences that are random (in the sense of Martin-Löf) have
constructive dimension 1, while sequences that are decidable, r.e., or co-r.e. have
constructive dimension 0. It is shown that for every $\Delta^0_2$-computable real
number $\alpha$ in [0, 1] there is a $\Delta^0_2$ sequence A such that cdim(A) = $\alpha$.
Every sequence’s constructive dimension is shown to be bounded above
and below by the limit supremum and limit infimum, respectively, of
the average Kolmogorov complexity of the sequence’s first n bits. Every
sequence that is random relative to a computable sequence of rational
biases that converge to a real number $\beta$ in (0, 1) is shown to have
constructive dimension $\mathcal{H}(\beta)$, the binary entropy of $\beta$.

Constructive dimension is based on constructive gales, which are a nat- 
ural generalization of the constructive martingales used in the theory of 
random sequences. 

Keywords: algorithmic information, computability, constructive dimen- 
sion, gales, Hausdorff dimension, Kolmogorov complexity, martingales, 
randomness. 



1 Introduction 

One of the most dramatic achievements of the theory of computing was
Martin-Löf’s 1966 use of constructive measure theory to give the first satisfactory
definition of the randomness of individual infinite binary sequences. The search
for such a definition had been a major object of early twentieth-century research
on the foundations of probability, but a rigorous mathematical formulation had
proven so elusive that the search had been all but abandoned more than two
decades earlier. Martin-Löf’s definition says precisely which infinite binary
sequences are random and which are not. The definition is probabilistically
convincing in that it requires each random sequence to pass every algorithmically
implementable statistical test of randomness. The definition is also robust in that
subsequent definitions by Schnorr, Levin, Chaitin, Solovay,
and Shen’, using a variety of different approaches, all define exactly
the same sequences to be random. It is noteworthy that all these approaches,
like Martin-Löf’s, make essential use of the theory of computing.

One useful characterization of random sequences is that they are those
sequences that have maximal algorithmic information content. Specifically, if
$K(A[0..n-1])$ denotes the Kolmogorov complexity (algorithmic information
content) of the first n bits of an infinite binary sequence A, then Levin and
Chaitin have shown that A is random if and only if there is a constant c such
that for all n, $K(A[0..n-1]) \ge n - c$. Indeed Kolmogorov developed what is
now called $C(x)$, the “plain Kolmogorov complexity,” in order to formulate such
a definition of randomness, and Martin-Löf, who was then visiting Kolmogorov,
was motivated by this idea when he defined randomness. (The quantity $C(x)$ was
also developed independently by Solomonoff and Chaitin.) Martin-Löf
subsequently proved that $C(x)$ cannot be used to characterize randomness,
and Levin and Chaitin introduced a technical modification of $C(x)$, now
called $K(x)$, the “Kolmogorov complexity,” in order to prove the above
characterization of random sequences. Schnorr proved a similar characterization
in terms of another variant, called the “monotone Kolmogorov complexity.”

One conclusion to be drawn from these characterizations is that the def- 
inition of random sequences distinguishes those sequences that have maximal 
algorithmic information content from those that do not. It offers no quantitative 
classification of the sequences that have less than maximal algorithmic informa- 
tion content. From a technical point of view, this aspect of the definition arises 
from its use of constructive measure, which is an algorithmic effectivization of 
classical Lebesgue measure. Specifically, an infinite binary sequence A is random 
if the singleton set {A} does not have constructive measure 0, and is nonran- 
dom if {A} does have constructive measure 0. Neither Lebesgue measure nor 
constructive measure offers quantitative distinctions among measure 0 sets. 

In 1919, Hausdorff augmented classical Lebesgue measure theory with a
theory of dimension. This theory assigns to every subset X of a given metric
space a real number $\dim_H(X)$, which is now called the Hausdorff dimension
of X. In this paper we are interested in the case where the metric space is the
Cantor space $\mathbf{C}$, consisting of all infinite binary sequences. In this case, the
Hausdorff dimension of a set $X \subseteq \mathbf{C}$ (which is defined precisely in section 3 below)
is a real number $\dim_H(X) \in [0, 1]$. The Hausdorff dimension is monotone, with
$\dim_H(\emptyset) = 0$ and $\dim_H(\mathbf{C}) = 1$. Moreover, if $\dim_H(X) < \dim_H(\mathbf{C})$, then X is
a measure 0 subset of $\mathbf{C}$. Hausdorff dimension thus offers a quantitative
classification of measure 0 sets. Moreover, Ryabko, Staiger, and Cai
and Hartmanis have all proven results establishing quantitative relationships
between Hausdorff dimension and Kolmogorov complexity.

Just as Hausdorff [?] augmented Lebesgue measure with a theory of dimension,
this paper augments the theory of individual random sequences with a
theory of the constructive dimension of individual sequences. Specifically, we
develop a constructive version of Hausdorff dimension and use this to assign
every sequence $A \in \mathbf{C}$ a constructive dimension $\mathrm{cdim}(A) \in [0,1]$. Sequences that
are random have constructive dimension 1, while sequences that are decidable,
r.e., or co-r.e. have constructive dimension 0. For every real number $\alpha \in [0, 1]$
there is a sequence $A$ such that $\mathrm{cdim}(A) = \alpha$. Moreover, if $\alpha$ is $\Delta^0_2$-computable,
then there is a $\Delta^0_2$ sequence $A$ such that $\mathrm{cdim}(A) = \alpha$. (This generalizes the
well-known existence of $\Delta^0_2$ sequences that are random.)

Regarding algorithmic information content, we prove that for every sequence
$A \in \mathbf{C}$,

$$\liminf_{n \to \infty} \frac{K(A[0..n-1])}{n} \le \mathrm{cdim}(A) \le \limsup_{n \to \infty} \frac{K(A[0..n-1])}{n}.$$

This justifies the intuition that the constructive dimension of a sequence is a
measure of its algorithmic information density.

We also relate constructive dimension to Shannon entropy. If $\vec{\beta} = (\beta_0, \beta_1, \ldots)$
is any computable sequence of rational numbers $\beta_i \in [0, 1]$ that converge to a
real number $\beta \in (0, 1)$ (which must therefore be $\Delta^0_2$-computable), we show that
every sequence that is random relative to $\vec{\beta}$ has constructive dimension $\mathcal{H}(\beta)$,
the binary entropy of $\beta$.

Our development of constructive dimension is based on gales, which are natural
generalizations of the constructive martingales used by Schnorr [?]
to characterize randomness. In a recent paper [?] we have shown that gales can
be used to characterize the classical Hausdorff dimension, and that
resource-bounded gales can be used to define dimension in complexity classes. In the
present paper we use constructive (lower semicomputable) gales to develop
constructive dimension. Constructive dimension differs markedly from both classical
Hausdorff dimension and the resource-bounded dimension developed in [?],
primarily due to the existence of gales that are optimal. These optimal gales,
defined in section 4, are analogous to universal tests of randomness in the theory
of random sequences.



2 Preliminaries 

We work in the Cantor space $\mathbf{C}$ consisting of all infinite binary sequences. The
$n$-bit prefix of a sequence $A \in \mathbf{C}$ is the string $A[0..n-1] \in \{0,1\}^*$ consisting of
the first $n$ bits of $A$. We say that a string $w \in \{0,1\}^*$ is a prefix of a sequence
$A \in \mathbf{C}$, and we write $w \sqsubseteq A$, if $A[0..|w|-1] = w$. The cylinder generated by a
string $w \in \{0,1\}^*$ is $\mathbf{C}_w = \{A \in \mathbf{C} \mid w \sqsubseteq A\}$. Note that $\mathbf{C}_\lambda = \mathbf{C}$, where $\lambda$ is the
empty string.

Definition. A probability measure on $\mathbf{C}$ is a function $\nu : \{0,1\}^* \to [0,1]$ with
the following two properties.

(i) $\nu(\lambda) = 1$.

(ii) For all $w \in \{0,1\}^*$, $\nu(w) = \nu(w0) + \nu(w1)$.

Intuitively, $\nu(w)$ is the probability that $A \in \mathbf{C}_w$ when $A \in \mathbf{C}$ is “chosen
according to the probability measure $\nu$.”

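Definition. A bias sequence is a sequence $\vec{\beta} = (\beta_0, \beta_1, \beta_2, \ldots)$, where each
$\beta_i \in [0, 1]$. A bias sequence $\vec{\beta} = (\beta_0, \beta_1, \ldots)$ is strongly positive if there exists
$\delta > 0$ such that for all $i \in \mathbb{N}$, $\delta \le \beta_i \le 1 - \delta$.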



Definition. If $\vec{\beta}$ is a bias sequence, then the $\vec{\beta}$-coin-toss probability measure
on $\mathbf{C}$ is the probability measure

$$\mu^{\vec{\beta}} : \{0,1\}^* \to [0,1], \qquad \mu^{\vec{\beta}}(w) = \prod_{i=0}^{|w|-1} \beta_i(w),$$

where $\beta_i(w) = (2\beta_i - 1)w[i] + (1 - \beta_i)$, i.e., $\beta_i(w) = \beta_i$ if $w[i] = 1$ and $\beta_i(w) = 1 - \beta_i$ otherwise.

Note that $\mu^{\vec{\beta}}(w)$ is the probability that $A \in \mathbf{C}_w$ when $A \in \mathbf{C}$ is chosen
according to a random experiment in which for each $i$, independently of all other
$j$, the $i$th bit of $A$ is decided by tossing a 0/1-valued coin whose probability of 1
is $\beta_i$.
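As an illustration of the product formula above, the following Python sketch evaluates the coin-toss measure of a cylinder with exact rational arithmetic. The names `coin_toss_measure` and `uniform`, and the choice to represent a bias sequence as a callable `bias(i)`, are assumptions made for this example and are not notation from the paper.

```python
from fractions import Fraction


def coin_toss_measure(w, bias):
    """Return the coin-toss measure of the cylinder generated by w.

    w    -- a binary string such as "0110"
    bias -- hypothetical callable; bias(i) is the probability beta_i
            that bit i equals 1
    """
    prob = Fraction(1)  # the empty string has measure 1
    for i, bit in enumerate(w):
        beta_i = Fraction(bias(i))
        # factor is beta_i if w[i] == 1, and 1 - beta_i otherwise
        prob *= beta_i if bit == "1" else 1 - beta_i
    return prob


def uniform(i):
    """Constant bias 1/2, which yields the uniform measure."""
    return Fraction(1, 2)
```

With the constant bias 1/2 every string of length n receives measure 2^(-n), and for any bias sequence the value at w is the sum of the values at w0 and w1, as property (ii) of a probability measure requires.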



Definition. If $\beta \in [0,1]$, then the $\beta$-coin-toss probability measure on $\mathbf{C}$ is the
probability measure $\mu^{\beta} = \mu^{\vec{\beta}}$, where $\vec{\beta} = (\beta, \beta, \beta, \ldots)$.

Definition. The uniform probability measure on $\mathbf{C}$ is the probability measure $\mu$
defined by $\mu(w) = 2^{-|w|}$ for all $w \in \{0,1\}^*$. (Note that $\mu = \mu^{1/2}$.)

We use several conditions involving the computability of real numbers and
real-valued functions in this paper.

Definition. Let $\alpha \in \mathbb{R}$.

1. $\alpha$ is computable if there is a computable function $f : \mathbb{N} \to \mathbb{Q}$ such that for
all $r \in \mathbb{N}$, $|f(r) - \alpha| \le 2^{-r}$.

2. $\alpha$ is lower semicomputable if its Dedekind left cut $\mathrm{left}(\alpha) = \{s \in \mathbb{Q} \mid s < \alpha\}$
is recursively enumerable.

3. $\alpha$ is $\Delta^0_2$-computable if $\alpha$ is computable relative to the halting oracle.

It is well known and easy to verify that every computable real is lower
semicomputable, that every lower semicomputable real is $\Delta^0_2$-computable, and that
the converses of these statements do not hold.

Definition. Let $f : D \to \mathbb{R}$, where $D$ is some discrete domain such as $\{0,1\}^*$ or
$\mathbb{N}$. Then $f$ is lower semicomputable if its lower graph

$$\mathrm{Graph}^-(f) = \{(x, s) \in D \times \mathbb{Q} \mid s < f(x)\}$$

is recursively enumerable.

A prefix set is a set $B \subseteq \{0,1\}^*$ such that no element of $B$ is a prefix of any
other element of $B$.

The reader is referred to the text by Li and Vitányi [?] for the definition
and basic properties of the Kolmogorov complexity $K(x)$, defined for strings
$x \in \{0,1\}^*$. Falconer [?] provides a good overview of Hausdorff dimension.

3 Gales and Hausdorff Dimension 

In this section we define gales and supergales and use these to define classical
Hausdorff dimension in the Cantor space $\mathbf{C}$. Our definitions are slightly more
general than those in [?] because here we need to define gales and supergales
relative to an arbitrary (not necessarily uniform) probability measure on $\mathbf{C}$.

Definition. Let $\nu$ be a probability measure on $\mathbf{C}$, and let $s \in [0,\infty)$.

1. A $\nu$-$s$-supergale is a function $d : \{0,1\}^* \to [0,\infty)$ that satisfies the condition

$$d(w)\nu(w)^s \ge d(w0)\nu(w0)^s + d(w1)\nu(w1)^s \qquad (*)$$

for all $w \in \{0,1\}^*$.

2. A $\nu$-$s$-gale is a $\nu$-$s$-supergale that satisfies (*) with equality for all
$w \in \{0,1\}^*$.

3. A $\nu$-supermartingale is a $\nu$-1-supergale.

4. A $\nu$-martingale is a $\nu$-1-gale.

5. An $s$-supergale is a $\mu$-$s$-supergale.

6. An $s$-gale is a $\mu$-$s$-gale.

7. A supermartingale is a 1-supergale.

8. A martingale is a 1-gale.

The following obvious but useful observation shows how gales and supergales 
are affected by variation of the parameter s. 

Observation 3.1 Let $\nu$ be a probability measure on $\mathbf{C}$, let $s, s' \in [0,\infty)$, and
let $d, d' : \{0,1\}^* \to [0,\infty)$. Assume that

$$d(w)\nu(w)^s = d'(w)\nu(w)^{s'}$$

for all $w \in \{0,1\}^*$.

1. $d$ is a $\nu$-$s$-supergale if and only if $d'$ is a $\nu$-$s'$-supergale.

2. $d$ is a $\nu$-$s$-gale if and only if $d'$ is a $\nu$-$s'$-gale.

For example, Observation 3.1 implies that a function $d : \{0,1\}^* \to [0,\infty)$ is
an $s$-gale if and only if the function $d' : \{0,1\}^* \to [0,\infty)$ defined by
$d'(w) = 2^{(1-s)|w|} d(w)$ is a martingale.
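The rescaling in this example can be checked numerically. The sketch below assumes the uniform measure, for which the $s$-gale condition reads $d(w) = 2^{-s}(d(w0) + d(w1))$, and tests it only on strings up to a fixed length with floating-point tolerance. The helper names and the sample gale `example_gale` are hypothetical and chosen purely for illustration.

```python
import math
from itertools import product


def is_s_gale(d, s, max_len=6):
    """Check the s-gale condition for the uniform measure on all strings
    of length at most max_len: d(w) = 2**(-s) * (d(w0) + d(w1))."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            rhs = 2 ** (-s) * (d(w + "0") + d(w + "1"))
            if not math.isclose(d(w), rhs, rel_tol=1e-9, abs_tol=1e-12):
                return False
    return True


def to_martingale(d, s):
    """Return d'(w) = 2**((1 - s) * |w|) * d(w)."""
    return lambda w: 2 ** ((1 - s) * len(w)) * d(w)


def example_gale(s):
    """Hypothetical s-gale that stakes everything on the next bit being 0."""
    return lambda w: 2 ** (s * len(w)) if "1" not in w else 0.0
```

For instance, `example_gale(0.5)` passes `is_s_gale(..., 0.5)`, and its image under `to_martingale(..., 0.5)` passes the same check with `s = 1.0`, which is the martingale condition. A finite check of this kind is only a sanity test, not a proof.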

The following useful lemma is a generalization of Kraft’s inequality. 

Lemma 3.2. Let $d$ be a $\nu$-$s$-supergale, where $\nu$ is a probability measure on $\mathbf{C}$
and $s \in [0,\infty)$. Then for all $w \in \{0,1\}^*$ and all prefix sets $B \subseteq \{0,1\}^*$,

$$\sum_{u \in B} d(wu)\nu(wu)^s \le d(w)\nu(w)^s.$$

Definition. Let $d$ be a $\nu$-$s$-supergale, where $\nu$ is a probability measure on $\mathbf{C}$
and $s \in [0,\infty)$.

1. We say that $d$ succeeds on a sequence $A \in \mathbf{C}$ if
$\limsup_{n \to \infty} d(A[0..n-1]) = \infty$.

2. The success set of $d$ is $S^\infty[d] = \{A \in \mathbf{C} \mid d \text{ succeeds on } A\}$.

We now show how to use the success sets of gales and supergales to define 
Hausdorff dimension. 

Notation. Let $X \subseteq \mathbf{C}$.

1. $\mathcal{G}(X)$ is the set of all $s \in [0,\infty)$ such that there is an $s$-gale $d$ for which
$X \subseteq S^\infty[d]$.

2. $\hat{\mathcal{G}}(X)$ is the set of all $s \in [0,\infty)$ such that there is an $s$-supergale $d$ for which
$X \subseteq S^\infty[d]$.

Lemma 3.3. For all $X \subseteq \mathbf{C}$, $\mathcal{G}(X) = \hat{\mathcal{G}}(X)$.

It was shown in [?] that the following definition is equivalent to the classical
definition of Hausdorff dimension in $\mathbf{C}$.

Definition. The Hausdorff dimension of a set $X \subseteq \mathbf{C}$ is $\dim_H(X) = \inf \mathcal{G}(X)$.

Note that by Lemma 3.3 we could equivalently use $\hat{\mathcal{G}}(X)$ in place of $\mathcal{G}(X)$ in
the above definition.

4 Constructive Dimension of Individual Sequences 

In this section we constructivize the above definition of Hausdorff dimension and 
use this to define the constructive dimensions of individual sequences. We then 
develop some fundamental properties of constructive dimension. 

Terminology. A $\nu$-$s$-supergale is constructive if it is lower semicomputable.






Notation. Let $X \subseteq \mathbf{C}$.

1. $\mathcal{G}_{\mathrm{constr}}(X)$ is the set of all $s \in [0,\infty)$ such that there is a constructive $s$-gale
$d$ for which $X \subseteq S^\infty[d]$.

2. $\hat{\mathcal{G}}_{\mathrm{constr}}(X)$ is the set of all $s \in [0,\infty)$ such that there is a constructive
$s$-supergale $d$ for which $X \subseteq S^\infty[d]$.



The following constructive analog of Lemma 3.3 is proven using Observation
3.1 and known properties of constructive martingales.

Lemma 4.1. For all $X \subseteq \mathbf{C}$, $\mathcal{G}_{\mathrm{constr}}(X)$ is a dense subset of $\hat{\mathcal{G}}_{\mathrm{constr}}(X)$.

In light of the foregoing, the following definition is quite natural.

Definition. The constructive dimension of a set $X \subseteq \mathbf{C}$ is
$\mathrm{cdim}(X) = \inf \mathcal{G}_{\mathrm{constr}}(X)$.

By Lemma 4.1, we also have $\mathrm{cdim}(X) = \inf \hat{\mathcal{G}}_{\mathrm{constr}}(X)$.
Recall that a set $X \subseteq \mathbf{C}$ has constructive measure 0 if there is a constructive
martingale $d$ such that $X \subseteq S^\infty[d]$. The following useful observations are clear.

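Observations 4.2 1. For all $X \subseteq Y \subseteq \mathbf{C}$, $\mathrm{cdim}(X) \le \mathrm{cdim}(Y)$.

2. For all $X \subseteq \mathbf{C}$, $\mathrm{cdim}(X) \ge \dim_H(X)$.

3. $\mathrm{cdim}(\mathbf{C}) = 1$.

4. For all $X \subseteq \mathbf{C}$, if $\mathrm{cdim}(X) < 1$, then $X$ has constructive measure 0.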

We now introduce the crucial concept of optimal constructive gales. 

Definition. Let $s \in [0,\infty)$. A constructive $s$-gale $d$ is optimal if for every
constructive $s$-gale $d'$ there is a constant $\epsilon > 0$ such that for all $w \in \{0,1\}^*$,
$d(w) \ge \epsilon d'(w)$.

Schnorr [?] has established the existence of an optimal constructive
martingale, which we call $d^{(1)}$. For each $s \in [0,\infty)$, we define the function

$$d^{(s)} : \{0,1\}^* \to [0,\infty), \qquad d^{(s)}(w) = 2^{(s-1)|w|} d^{(1)}(w).$$

The following important result follows immediately from Observation 3.1 and
the optimality of $d^{(1)}$.

Theorem 4.3. For every computable real number $s \in [0,\infty)$, the function $d^{(s)}$
is an optimal constructive $s$-gale.

Definition. The constructive dimension of a sequence $A \in \mathbf{C}$ is
$\mathrm{cdim}(A) = \mathrm{cdim}(\{A\})$. For each $\alpha \in [0,1]$, we write
$\mathrm{DIM}^{\alpha} = \{A \in \mathbf{C} \mid \mathrm{cdim}(A) = \alpha\}$ and
$\mathrm{DIM}^{\le\alpha} = \{A \in \mathbf{C} \mid \mathrm{cdim}(A) \le \alpha\}$.

We first give a simple diagonalization proof that for every real number
$\alpha \in [0,1]$, there is a sequence whose constructive dimension is $\alpha$.

Lemma 4.4. For all $\alpha \in [0,1]$, $\mathrm{DIM}^{\alpha} \ne \emptyset$.

The constructive dimension of a set can be completely characterized in terms 
of the constructive dimensions of its elements in the following way. 

Lemma 4.5. For all $X \subseteq \mathbf{C}$, $\mathrm{cdim}(X) = \sup_{A \in X} \mathrm{cdim}(A)$.

Lemma 4.5, which has no analog either in classical Hausdorff dimension or in
the resource-bounded dimension developed in [?], depends crucially on Theorem
4.3. Lemma 4.5 yields an easy proof that constructive dimension has the following
property (which is also a property of classical Hausdorff dimension).

Corollary 4.6. For all $X_0, X_1, X_2, \ldots \subseteq \mathbf{C}$,
$\mathrm{cdim}(\bigcup_{k=0}^{\infty} X_k) = \sup_{k \in \mathbb{N}} \mathrm{cdim}(X_k)$.

Lemmas 4.4 and 4.5 also have the following consequence, which states that
for each $\alpha \in [0,1]$, $\mathrm{DIM}^{\le\alpha}$ is the largest set of dimension $\alpha$.

Corollary 4.7. For every real number $\alpha \in [0,1]$, the set $\mathrm{DIM}^{\le\alpha}$ has the
following two properties.

1. $\mathrm{cdim}(\mathrm{DIM}^{\le\alpha}) = \alpha$.

2. For all $X \subseteq \mathbf{C}$, if $\mathrm{cdim}(X) \le \alpha$, then $X \subseteq \mathrm{DIM}^{\le\alpha}$.

We also have the following.

Corollary 4.8. For every real number $\alpha \in [0,1]$, $\mathrm{cdim}(\mathrm{DIM}^{\alpha}) = \alpha$.

Recall that a sequence $A \in \mathbf{C}$ is random if the singleton set $\{A\}$ does not have
constructive measure 0. We write RAND for the set of all random sequences.
The following is obvious.

Observation 4.9 $\mathrm{RAND} \subseteq \mathrm{DIM}^{1}$.

It is well known that there are no random sequences in the set $\Sigma^0_1 \cup \Pi^0_1$,
consisting of all characteristic sequences of r.e. or co-r.e. sets. We now show that much
more is true.

Lemma 4.10. $\Sigma^0_1 \cup \Pi^0_1 \subseteq \mathrm{DIM}^{0}$.

An important result in the theory of random sequences is the existence of
random sequences in $\Delta^0_2$. By Observation 4.9, this immediately implies the
existence of $\Delta^0_2$ sequences of constructive dimension 1. We would like to extend this
result to other positive dimensions. We first note that, since $\Delta^0_2$ is countable,
there can only be $\Delta^0_2$ sequences of countably many different constructive
dimensions. We also note that the proof that $\mathrm{RAND} \cap \Delta^0_2 \ne \emptyset$ using Kreisel’s Basis
Lemma [?] and the fact that RAND is a $\Sigma^0_2$ class does not directly carry
over to the present question because the class $\mathrm{DIM}^{\alpha}$ does not appear to be
$\Sigma^0_2$. (Note: Indeed, Terwijn has very recently proven that it is not.) Instead, we
first prove a dimension reduction theorem, which has independent interest, and
then use this to derive the existence of $\Delta^0_2$ sequences of constructive dimension
$\alpha$ (for suitable $\alpha$) from the existence of $\Delta^0_2$ sequences that are random.

Define an approximator of a real number $\alpha \in [0,1]$ to be an ordered pair
$(a, b)$ of computable functions $a, b$ on $\mathbb{N}$ with the following properties.

(i) For all $n \in \mathbb{N}$, $a(n) \le b(n)$.

(ii) $\lim_{n \to \infty} a(n)/b(n) = \alpha$.

It is well known and easy to see that a real number $\alpha \in [0,1]$ has an
approximator if and only if it is $\Delta^0_2$-computable. Moreover, every $\Delta^0_2$-computable
real number has an approximator $(a, b)$ that is nice in the sense that if we let

as $k \to \infty$.

Given an approximator $(a, b)$ of a $\Delta^0_2$-computable real number $\alpha \in [0,1]$, we
define the $(a, b)$-dilution function

$$g_{(a,b)} : \mathbf{C} \to \mathbf{C}$$

as follows. Given $A \in \mathbf{C}$, if we write

$$A = w_0 w_1 w_2 \ldots,$$

where $|w_n| = a(n)$ for each $n \in \mathbb{N}$, then

$g_{(a,b)}(A) =$ ....

Note that $g_{(a,b)}(A) \equiv_1 A$ for all $A \in \mathbf{C}$.

Theorem 4.11. Let $\alpha \in [0,1]$ be $\Delta^0_2$-computable, and let $(a, b)$ be a nice
approximator of $\alpha$. Then for all $A \in \mathbf{C}$,

$$\mathrm{cdim}(g_{(a,b)}(A)) = \alpha \cdot \mathrm{cdim}(A).$$

By Theorem 4.11, Observation 4.9, and the known existence of $\Delta^0_2$ sequences
that are random, we now have the following.

Theorem 4.12. For every $\Delta^0_2$-computable real number $\alpha \in [0,1]$,
$\mathrm{DIM}^{\alpha} \cap \Delta^0_2 \ne \emptyset$, i.e., there is a $\Delta^0_2$ sequence $A$ such that $\mathrm{cdim}(A) = \alpha$.

Note that the proof of Theorem 4.12 via Theorem 4.11 yields even more,
namely, that if $\alpha, \beta \in [0,1]$ are $\Delta^0_2$-computable with $\alpha \ge \beta$, then every sequence
in $\mathrm{DIM}^{\alpha}$ is 1-equivalent to some sequence in $\mathrm{DIM}^{\beta}$.

We now relate constructive dimension to Kolmogorov complexity. 

Theorem 4.13. For all $A \in \mathbf{C}$,

$$\liminf_{n \to \infty} \frac{K(A[0..n-1])}{n} \le \mathrm{cdim}(A) \le \limsup_{n \to \infty} \frac{K(A[0..n-1])}{n}.$$

Much of the technical content of the proof of Theorem 4.13 can be found in
the investigations by Ryabko [?] and Staiger [?] of the relationships
between Kolmogorov complexity and classical Hausdorff dimension, and also in
the calculation by Cai and Hartmanis [?] of the Hausdorff dimension of the graph
of the average Kolmogorov complexities of real numbers.

Theorem 4.13 justifies the intuition that the constructive dimension of a
sequence is its algorithmic information density.

Our last result on constructive dimension relates randomness over non-uniform
distributions to constructive dimension. Recall that a sequence $A \in \mathbf{C}$ is
random relative to a probability measure $\nu$ on $\mathbf{C}$ if there is no constructive
$\nu$-martingale $d$ such that $A \in S^\infty[d]$. Given a bias sequence $\vec{\beta}$, we write $\mathrm{RAND}_{\vec{\beta}}$
for the set of all sequences that are random relative to the $\vec{\beta}$-coin-toss
probability measure $\mu^{\vec{\beta}}$ defined in section 2. Recall also the binary entropy function

$$\mathcal{H}(\beta) = \beta \log \frac{1}{\beta} + (1 - \beta) \log \frac{1}{1 - \beta}$$

of Shannon information theory.
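For concreteness, the binary entropy function can be evaluated as in the Python sketch below. It assumes base-2 logarithms, and it extends the function to the endpoints 0 and 1 by the usual convention that the entropy there is 0; both choices are assumptions of this example, since the text only uses $\beta \in (0,1)$.

```python
import math


def binary_entropy(beta):
    """Binary entropy of beta in [0, 1], using base-2 logarithms."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if beta in (0.0, 1.0):
        return 0.0  # conventional value at the endpoints (assumption)
    return beta * math.log2(1 / beta) + (1 - beta) * math.log2(1 / (1 - beta))
```

The function is symmetric about 1/2 and takes its maximum value 1 there, which matches the fact that sequences random for the uniform measure have constructive dimension 1.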

Theorem 4.14. If $\vec{\beta}$ is a computable sequence of rational biases that converges
to a real number $\beta \in (0,1)$, then

$$\mathrm{RAND}_{\vec{\beta}} \subseteq \mathrm{DIM}^{\mathcal{H}(\beta)}.$$

Note that Observation 4.9 is exactly the case $\vec{\beta} = (\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \ldots)$ of Theorem
4.14. Note also that Theorem 4.14 can be used to give a second (albeit less
informative) proof of Theorem 4.12.

Computable bias sequences that converge slowly to $\frac{1}{2}$ have played an important
role in the investigation of stochasticity versus randomness. First, van
Lambalgen [?] and, independently, Vovk [?] proved that if $\vec{\beta}$ is a bias sequence
such that $\sum_{i=0}^{\infty} (\beta_i - \frac{1}{2})^2 = \infty$, then $\mathrm{RAND}_{\vec{\beta}} \cap \mathrm{RAND} = \emptyset$. Also, van Lambalgen
[?] proved that if $\vec{\beta}$ is any computable bias sequence that converges to $\frac{1}{2}$, then
every element of $\mathrm{RAND}_{\vec{\beta}}$ is Church-stochastic. Taking $\vec{\beta}$ to converge to $\frac{1}{2}$ but
to do so slowly enough that $\sum_{i=0}^{\infty} (\beta_i - \frac{1}{2})^2 = \infty$ (e.g., $\beta_i = \frac{1}{2} + \ldots$), this gives
a new proof that not every Church-stochastic sequence is random. More
significantly, Shen' [?] strengthened van Lambalgen’s latter result by showing that
if $\vec{\beta}$ is any computable bias sequence that converges to $\frac{1}{2}$, then every element
of $\mathrm{RAND}_{\vec{\beta}}$ is Kolmogorov-Loveland stochastic. Again taking $\vec{\beta}$ to converge to
$\frac{1}{2}$ slowly enough that $\sum_{i=0}^{\infty} (\beta_i - \frac{1}{2})^2 = \infty$, this allowed Shen' to conclude that
not every Kolmogorov-Loveland stochastic sequence is random, thereby solving
a twenty-year-old problem of Kolmogorov [?] and Loveland [?]. Theorem
4.14 has the following consequence concerning such sequences $\vec{\beta}$.

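Corollary 4.15. If $\vec{\beta}$ is a computable sequence of rational biases that converges
to $\frac{1}{2}$ slowly enough that $\sum_{i=0}^{\infty} (\beta_i - \frac{1}{2})^2 = \infty$, then

$$\mathrm{RAND}_{\vec{\beta}} \subseteq \mathrm{DIM}^{1} - \mathrm{RAND}.$$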

That is, every sequence that is random with respect to such a bias sequence 
is an example of a sequence that has constructive dimension 1 but is not 
random.



Abstract. We consider separations of reducibilities by random sets.
First, we show a result on polynomial-time bounded reducibilities which
query their oracle non-adaptively: for every p-random set R, there is a
set which is reducible to R with k + 1 queries, but is not reducible to any
other p-random set with at most k queries. This result solves an open
problem stated in a recent survey paper by Lutz and Mayordomo [?].
Second, we show that the separation result above can be transferred from
the setting of polynomial time bounds to a setting of rec-random sets and
recursive reducibilities. This extends the main result of Book, Lutz, and
Martin [?], who, by using different methods, showed a similar separation
with respect to Martin-Löf-random sets. Moreover, in both settings we
obtain similar separation results for truth-table versus bounded
truth-table reducibility.



1 Introduction and Related Work 

We consider separations of reducibilities in the context of resource-bounded
measure theory. In the following, we use the symbol $\le$ with appropriate sub- or
superscripts to denote reducibilities, i.e., binary relations on Cantor space, the
class of all sets of natural numbers. We say two reducibilities $\le_r$ and $\le_s$ are
separated by an oracle A if the lower spans of A with respect to these
reducibilities, i.e., the classes $\{X : X \le_r A\}$ and $\{X : X \le_s A\}$, differ. It is easy to see
that two reducibilities are different (as binary relations on Cantor space) if and
only if they are separated by some oracle. Beyond this simple observation, the
question of which reducibilities are separated by what kind of oracles has been
the object of intensive studies. Here, for a given pair of reducibilities, typical
questions are the following. Are there separating oracles of low complexity? How
comprehensive is the class of separating oracles? Which properties are sufficient for
being a separating oracle?

Ladner, Lynch, and Selman [?] considered separations of the usual
polynomial-time bounded reducibilities in the range between p-m- and p-T-reducibility
(see Sect. 2 for definitions). They showed that for every distinct pair of
such reducibilities, there is a separating oracle which can be computed in
exponential time. Subsequently, in their seminal paper [?], Bennett and Gill obtained
results about separations by almost all oracles, i.e., they showed that for certain
pairs of reducibilities the class of separating oracles has measure 1 with respect
to uniform measure on Cantor space. In fact, for every k > 0, every pair of
distinct reducibilities chosen among p-T-, p-tt-, p-btt-, p-btt(k+1)-, and p-btt(k)-
reducibility can be separated by random oracles; see [?], as well as for a
separation of p-btt(k+1)- and p-btt(k)-reducibility by almost all tally oracles.

U. Montanari et al. (Eds.): ICALP 2000, LNCS 1853, pp. 014- B^ 2000.

A separation by almost all oracles can be expressed equivalently by saying
that the class of oracles which do not separate the reducibilities under
consideration has uniform measure 0. Lutz and Mayordomo [?] showed for
certain pairs of polynomial-time bounded reducibilities of truth-table type that
the class of non-separating oracles does not just have uniform measure 0 but
is in fact covered by a polynomial-time computable martingale. In particular,
they showed that for every natural number k, there is a polynomial-time
computable martingale which covers all oracles which do not separate p-btt(k+1)-
and p-btt(k)-reducibility, whence, in particular, these reducibilities are separated
by every p-random oracle. The latter can be rephrased by saying that these
reducibilities are locally separated by the class of p-random oracles. Here, formally,
a nonempty class C locally separates two given reducibilities iff for every set A
in C, the lower spans of A with respect to these reducibilities are different.

We say a class C globally separates two given reducibilities in case for every
set A in C there is a set B which is reducible to A with respect to one of the given
reducibilities but B is not reducible to any set in C with respect to the other
reducibility. Moreover, in case such a set B exists not for all but just for some
sets A in C, we say that C yields a weak global separation of the reducibilities
under consideration. The definition of global separation is symmetric in the
reducibilities involved; however, for reducibilities $\le_r$ and $\le_s$ where $X \le_r Y$
implies $X \le_s Y$, sets A and B as above must satisfy $B \le_s A$ and $B \not\le_r A$ (in
fact, $B \not\le_r Z$ for all Z in C), and similar remarks hold for the other concepts
of separation mentioned so far. In distinguishing local and global separation we
follow Book, Lutz, and Martin [?], who discuss such separations for the classes
of Martin-Löf-random, tally, and sparse sets.

In the sequel we will consider global separations by various classes of random
sets. Such investigations can be viewed as part of a more comprehensive research
project where one asks which types of reductions are able to transform random
objects into what types of far from random objects; see [?] and [?] for results
in this direction and for further discussion and references.

Remark 1. By definition, every local or global separation by a class C extends 
trivially to every nonempty subclass of C. 

In Theorem ?? we show that the class of p-random oracles yields a global
separation of p-btt(k+1)- and p-btt(k)-reducibility. This, together with Remark 1,
solves Problem 7 in the recent survey article [?], where it has been asked to
prove or disprove that, in our terms, the class of p-random oracles yields a weak
global separation of these reducibilities. In Sect. ??, we then obtain by basically
the same proof as for Theorem ?? that for every natural number k, the class of
rec-random sets globally separates p-btt(k+1)-reducibility from btt(k)-reducibility,
i.e., from the reducibility restricted to at most k non-adaptive queries where the
reductions are computed by total Turing machines which might run in arbitrary
time and space. By Remark 1, this yields as a special case the main result of
Book, Lutz, and Martin [?], who showed, by using different methods, a
corresponding global separation with respect to the class of Martin-Löf-random sets,
which is a proper subclass of the class of rec-random sets. Moreover, we will argue
that in both settings, i.e., for polynomial-time bounded, as well as for recursive
reductions and martingales, the corresponding random sets globally separate the
corresponding notions of truth-table and bounded truth-table reducibility.



2 Notation 

The notation used in the following is mostly standard; for unexplained
notation refer to [?] and [?]. All strings are over the alphabet $\Sigma = \{0, 1\}$.

We identify strings with natural numbers via the isomorphism which takes the
length-lexicographical ordering on $\{\lambda, 0, 1, 00, \ldots\}$ to the usual ordering on $\omega$,
the set of natural numbers. If not explicitly stated differently, the terms set and
class refer to sets of natural numbers and to sets of sets of natural numbers,
respectively.

A partial characteristic function is a (total) function from some subset of the
natural numbers to {0, 1}. A partial characteristic function is finite iff its domain
is finite. The restriction of a partial characteristic function $\beta$ to a set I is denoted
by $\beta|I$, whence in particular for a set X, the partial characteristic function $X|I$
has domain I and agrees there with X. We identify strings of length n in the
natural way with a partial characteristic function with domain $\{0, \ldots, n-1\}$,
whence in particular strings can be viewed as prefixes of sets. For a partial
characteristic function $\alpha$ with domain $\{z_0 < \ldots < z_{n-1}\}$, the string associated
with $\alpha$ is the (unique) string $\beta$ where $\beta(j) = \alpha(z_j)$ for $j = 0, \ldots, n-1$. For a
set X and a partial characteristic function $\sigma$ we write $\langle X, \sigma \rangle$ for the set which
agrees with $\sigma$ for all arguments in the domain of $\sigma$ and which agrees with X
otherwise.

We will consider the following polynomial-time bounded reducibilities: Turing
reducibility (p-T), truth-table reducibility (p-tt), where the queries have to
be asked non-adaptively, bounded truth-table reducibility (p-btt), where for each
reduction the number of queries is bounded by a constant, and, even more
restrictive, p-btt(k)-reducibility, where for all reductions this constant is bounded
by the natural number k. The relation symbol $\le^p_{btt}$ refers to p-btt-reducibility,
and relation symbols for other reducibilities are defined in a similar fashion.
Expressions such as p-T-reduction and $\le^p_T$-reduction will be used interchangeably.
We represent p-btt-reductions by a pair of polynomial-time computable
functions g and h where g(x) gives the set of strings queried on input x and h(x)
is a truth-table of a Boolean function over k variables which specifies how the
answers to the queries in the set g(x) are evaluated. Here we assume, firstly,
via introducing dummy variables, that the cardinality of g(x) is always exactly
k and, secondly, by convention, that for i = 1, ..., k, the ith argument of the
Boolean function h(x) is assigned the ith query in g(x).



The Global Power of Additional Queries to p-Random Oracles 






3 Resource-Bounded Measure 

We give a brief introduction to resource-bounded measure which focuses on
the concepts that will be used in subsequent sections. For more comprehensive
accounts of resource-bounded measure theory see the recent survey papers by
Ambos-Spies and Mayordomo [?] and by Lutz [?].

The theory of resource-bounded measure is usually developed in terms of
martingales, which can be viewed as payoff functions of gambles of the following
type. A player successively places bets on the individual bits of the characteristic
sequence of an unknown set A or, for short, the player bets on A. The betting
proceeds in rounds i = 1, 2, ... where during round i, the player receives the
length i − 1 prefix of A and then, firstly, decides whether to bet on the ith bit
being 0 or 1 and, secondly, determines the stake by specifying the fraction of
the current capital which shall be bet. Formally, a player can be identified with
a betting strategy $b : \{0,1\}^* \to [-1, 1]$ where the bet is placed on the next bit
being 0 or 1 depending on whether b(w) is negative or nonnegative, respectively,
and where the absolute value of the real b(w) is the fraction of the current capital
that shall be at stake.

The player starts with strictly positive, finite capital. At the end of each
round, in case the current guess has been correct, the capital is increased by
this round’s stake and, otherwise, is decreased by the same amount. So given a
betting strategy b, we can inductively compute the corresponding payoff function
d by applying the equations

$$d(w0) = d(w) - b(w) \cdot d(w), \qquad d(w1) = d(w) + b(w) \cdot d(w).$$

Intuitively speaking, the payoff d(w) is the capital the player accumulates till the
end of round |w| by betting on a set which has the string w as a prefix. Conversely,
every function d from strings to nonnegative reals which, for all strings w, satisfies
the fairness condition

$$d(w) = \frac{d(w0) + d(w1)}{2} \qquad (1)$$

induces canonically a betting function b, where

$$b(w) = \frac{d(w1) - d(w0)}{2} \cdot \frac{1}{d(w)}$$

in case d(w) differs from 0 and b(w) = 0, otherwise. We call a function d from
strings to nonnegative reals a martingale iff $d(\lambda) > 0$ and d satisfies the fairness
condition (1) for all strings w.
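The passage between betting strategies and payoff functions can be sketched in Python as follows. This is only an illustration under stated assumptions: strings are Python `str` values over "0" and "1", capital is tracked with exact fractions, the initial capital defaults to 1, and the names `payoff`, `induced_bet`, and `is_fair` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def payoff(b, w, initial=Fraction(1)):
    """Capital reached on prefix w when betting according to strategy b."""
    d = Fraction(initial)
    for i, bit in enumerate(w):
        stake = Fraction(b(w[:i])) * d  # signed stake; negative means a bet on 0
        d = d + stake if bit == "1" else d - stake
    return d


def induced_bet(d, w):
    """Betting fraction recovered from a payoff function d at the string w."""
    if d(w) == 0:
        return Fraction(0)
    return (d(w + "1") - d(w + "0")) / (2 * d(w))


def is_fair(d, max_len=5):
    """Check the fairness condition 2*d(w) == d(w0) + d(w1) up to max_len."""
    return all(
        2 * d("".join(bits)) == d("".join(bits) + "0") + d("".join(bits) + "1")
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )
```

Starting from any strategy with values in [−1, 1], the function `lambda w: payoff(b, w)` passes `is_fair`, and `induced_bet` recovers `b(w)` wherever the capital is nonzero.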

By the preceding discussion it follows for gambles as described above that the
possible payoff functions are exactly the martingales and that in fact there is a
one-to-one correspondence between martingales and betting strategies. We will
frequently identify martingales and betting strategies via this correspondence
and, if appropriate, notation introduced for martingales will be extended to the
induced betting strategies.

We say a martingale d succeeds on a set A if d is unbounded on the prefixes
of A, i.e., if $\limsup_{n \to \infty} d(A|\{0, \ldots, n\}) = \infty$, and d succeeds on or covers a
class iff d succeeds on every set in the class. It has been shown by Ville that a
class has uniform measure 0 iff the class can be covered by some martingale (see
[?]). Thus every countable class and, in particular, most of the classes considered
in complexity and recursion theory can be covered by martingales, whence
in order to distinguish such classes in terms of coverability one has to restrict
the class of admissible martingales. In the context of recursion theory, this led
to the consideration of recursive martingales, whereas in connection with
complexity classes one has to impose additional resource-bounds, see [?]
and [?], respectively.¹ Here, in general, for a given class C one is
interested in finding a class of martingales which allows the covering of interesting
subclasses of C, but not of C itself.

In connection with measure on complexity classes, measure concepts for the
exponentially time-bounded classes $\mathrm{E} = \mathrm{DTIME}(2^{\mathrm{lin}})$ and
$\mathrm{EXP} = \mathrm{DTIME}(2^{\mathrm{poly}})$ have received the most attention. For example, in the case of the class
E, Lutz proposed to use martingales which on input w are computable in time
polynomial in the length of w.

We say a set is p-random if the set cannot be covered by a polynomial-time
computable martingale, and we write p-RAND for the class of all p-random
sets. The notion rec-random set and the class rec-RAND of all rec-random sets
are defined likewise with recursive martingales in place of polynomial-time
computable ones. Moreover, we will consider Martin-Löf-random sets, which have
been introduced in [?] and have been characterized equivalently by Schnorr [?]
as the sets which cannot be covered by so-called subcomputable martingales.
The classes of p-random, rec-random, and Martin-Löf-random sets all have
uniform measure 1, whence each of these classes of random sets can, in a sense, be
viewed as a class of typical sets. For a proof it suffices to observe that the class of
sets on which a single martingale succeeds always has uniform measure 0 and,
by $\sigma$-additivity, the same holds for every countable union of such classes.

We conclude this section by two remarks in which we describe standard 
techniques for the construction of martingales. 

Remark 2. Let a finite set $D$ be given, as well as a list $(D_1, \ldots, D_m)$ of pairwise
disjoint subsets of $D$ which all have the same cardinality $k > 0$. Then for a partial
characteristic function $\sigma$ with domain $D$ and a string $w$ of length $k$ we might
ask for the frequency

$$\alpha(\sigma, w, (D_1, \ldots, D_m)) := \frac{|\{j : w \text{ is the associated string of } \sigma|D_j\}|}{m}$$

with which $w$ occurs in $\sigma$ as associated string at the positions specified by the
$D_i$. In case the sets $D_i$ are clear from the context, we suppress mentioning them
and write $\alpha(\sigma, w)$, for short.
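The frequency from Remark 2 can be illustrated by a small sketch. It assumes that positions are binary strings, that a partial characteristic function is a dictionary from positions to bits, and that the associated string of a block lists the bits in length-lexicographical order of the positions; the names `associated_string` and `frequency` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def associated_string(sigma, block):
    """Bits of sigma on `block`, read in length-lexicographical order (an assumption)."""
    ordered = sorted(block, key=lambda z: (len(z), z))
    return "".join(str(sigma[z]) for z in ordered)

def frequency(sigma, w, blocks):
    """Fraction of blocks D_j on which the associated string of sigma equals w."""
    hits = sum(1 for block in blocks if associated_string(sigma, block) == w)
    return Fraction(hits, len(blocks))

# Toy instance with m = 3 pairwise disjoint blocks of cardinality k = 2.
sigma = {"00": 1, "01": 1, "10": 1, "11": 0, "000": 0, "001": 1}
blocks = [{"00", "01"}, {"10", "11"}, {"000", "001"}]
print(frequency(sigma, "11", blocks))
```

In the toy instance only the first block carries the string 11, so the frequency of 11 is one third.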

^ An effective martingale d is always confined to rational values and is computed
by a Turing machine which on input w outputs an appropriate finite representation
of d(w).







If we choose the bits of $\sigma$ by independent tosses of a fair coin, then for every $w$
of length $k$, the expected value of $\alpha(\sigma, w)$ is $1/2^k$. It is suggestive to assume that
for large $m$, only for a small fraction of all partial characteristic functions with
domain $D$ the frequency of $w$ will deviate significantly from the expected value.
Using Chernoff bounds (see for example Lemma 11.9 in [?]) one can indeed
show that given $k$ and a rational $\varepsilon > 0$, we can compute a natural number
$m(k, \varepsilon)$ such that for all $m \geq m(k, \varepsilon)$ and for all $D$ and $D_1, \ldots, D_m$ as above we
have

$$\frac{\left|\left\{\sigma : D \to \{0,1\} \;:\; \frac{1}{2} \cdot \frac{1}{2^k} < \alpha(\sigma, w, (D_1, \ldots, D_m)) < \frac{3}{2} \cdot \frac{1}{2^k}\right\}\right|}{2^{|D|}} \geq 1 - \varepsilon . \qquad (2)$$
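For small parameters the left-hand side of (2) can be evaluated exactly rather than bounded. The following sketch relies on the observation that, under fair coin tosses, the number of blocks carrying $w$ is binomially distributed with parameters $m$ and $2^{-k}$; the function name `concentrated_fraction` is hypothetical. It does not compute $m(k, \varepsilon)$ itself, since that would require a bound valid for all larger $m$, which is what the Chernoff argument supplies.

```python
from fractions import Fraction
from math import comb

def concentrated_fraction(k, m):
    """Exact fraction of sigma whose frequency of w lies strictly between
    (1/2) * 2^-k and (3/2) * 2^-k, for m disjoint blocks of size k."""
    p = Fraction(1, 2**k)          # probability that a single block carries w
    total = Fraction(0)
    for hits in range(m + 1):
        alpha = Fraction(hits, m)  # frequency when exactly `hits` blocks carry w
        if p / 2 < alpha < 3 * p / 2:
            total += comb(m, hits) * p**hits * (1 - p)**(m - hits)
    return total

print(concentrated_fraction(1, 4))
print(float(concentrated_fraction(1, 100)))
```

For $k = 1$ and $m = 4$ only the outcome with exactly two hits falls inside the interval, while for $m = 100$ almost all of the mass does.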



Remark 3. Let $I$ be a finite set and let $\Theta$ be a subset of the set of all partial characteristic
functions with domain $I$. We can easily construct a martingale which, by betting
on places in $I$, increases its capital by a factor of $2^{|I|}/|\Theta|$ for all sets $B$ where
$B|I$ is in $\Theta$. Here the martingale takes the capital available when betting on the
minimal element of $I$ and distributes it evenly among the elements of $\Theta$, then
computing values upwards according to the fairness condition for martingales.
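The construction in Remark 3 admits a closed form, sketched below under the assumption that the places in $I$ are revealed in increasing order and that assignments are represented as bit tuples. Each element of $\Theta$ receives an equal share of the capital and stakes all of it on its own bits, so the capital after a prefix is proportional to the number of elements of $\Theta$ that still agree with that prefix. The function name `capital` is hypothetical.

```python
from fractions import Fraction

def capital(prefix, theta, start=Fraction(1)):
    """Capital after the bits in `prefix` have been revealed on the first places of I.

    `theta` is a non-empty set of equal-length bit tuples (the assignments in Theta).
    Every element of theta gets start/|theta| and doubles its share on each
    correctly predicted bit; elements that disagree with `prefix` have lost theirs.
    """
    agreeing = sum(1 for t in theta if t[:len(prefix)] == prefix)
    return start * agreeing * 2**len(prefix) / len(theta)

theta = {(0, 0, 1), (1, 1, 0)}
print(capital((), theta))         # capital before betting on I
print(capital((0, 0, 1), theta))  # capital when the revealed bits lie in Theta
```

The fairness condition holds because the elements agreeing with a prefix split into those agreeing with its two one-bit extensions.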

4 Separations by p-Random Oracles 

Theorem 4 extends Lutz and Mayordomo's local separation of p-btt($k+1$)- and
p-btt($k$)-reducibility in [?] to a global separation. Recall that the lower $\leq_r$-span
of a class $C$ is the class of all sets which are $\leq_r$-reducible to some set in $C$.

Theorem 4. Let $R$ be a p-random set and let $k$ be a natural number. Then
the lower p-btt($k+1$)-span of $R$ is not contained in the lower p-btt($k$)-span of
p-RAND.

Proof. In order to define a set $A$ and a p-btt($k+1$)-reduction $(g_0, h_0)$ from $A$ to
$R$ we let $h_0(x)$ be the truth-table of the $(k+1)$-place conjunction and we let

$$g_0(x) := \ldots , \qquad A := \{x : g_0(x) \subseteq R\} . \qquad (3)$$

We are done if we can show that if $A$ is p-btt($k$)-reducible to a set, then this
set cannot be p-random. So let $B$ be an arbitrary set and assume that $A$ is
reducible to $B$ via the p-btt($k$)-reduction $(g, h)$. We will construct a polynomial-time
computable martingale $d$ which succeeds on $B$. To this end, we define a
sequence $n_0, n_1, \ldots$ with

$$n_0 = 0 , \qquad n_{i+1} > 2^{n_i} , \qquad \log n_{i+1} \geq m(k+1, 1/2^{i+1}) \qquad (4)$$

(here $m(\cdot, \cdot)$ is the function defined in Remark 2) and such that given $x$ of length
$n$, we can compute in time $O(n^2)$ the maximal $i$ with $n_i \leq n$. Such a sequence
can be obtained by standard methods; for details refer to the chapter on uniform
diagonalization and gap languages in [7].

It is helpful to view the betting strategy of the martingale $d$ as being performed
in stages $i = 0, 1, \ldots$ where the bets of stage $i$ depend on the $g$-images of
the strings of length $n_i$. While considering the queries made for strings of length
$n_i$ with $i > 0$, we will distinguish short queries with length strictly less than

$$l_i := \left\lfloor \frac{n_i}{2k} \right\rfloor \qquad (5)$$

and long queries, i.e. queries of length at least $l_i$. We call two strings $x$ and $y$
equivalent iff, for some $i$, both have identical length $n_i$ and in addition we have

$$\text{(i)}\ h(x) = h(y) , \qquad \text{(ii)}\ \{z \in g(x) : |z| < l_i\} = \{z \in g(y) : |z| < l_i\} , \qquad (6)$$



i.e., two strings of length $n_i$ are equivalent iff they have the same truth-table
and the same set of short queries. Then for some constant $c$, the number of
equivalence classes of strings of length $n_i$ can be bounded as follows

$$2^{2^k} \cdot \sum_{j=0}^{k} \binom{2^{l_i}}{j} \;\leq\; 2^{2^k} (k+1) \cdot 2^{k l_i} \;\leq\; c \cdot 2^{k l_i} \;\leq\; c \cdot 2^{n_i/2} .$$
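The counting argument can be checked numerically for small parameters. The sketch below assumes that an equivalence class is determined by one of the $2^{2^k}$ truth-tables over $k$ variables together with a set of at most $k$ short queries, and it overestimates the number of strings shorter than $l$ by $2^l$; all function names are hypothetical. The last function applies the pigeonhole principle to obtain a lower bound on the size of the largest class.

```python
from math import comb

def class_count_bound(k, l):
    """Truth-tables over k variables times sets of at most k strings of length < l."""
    return 2**(2**k) * sum(comb(2**l, j) for j in range(k + 1))

def coarse_bound(k, l):
    """The simplified bound 2^(2^k) * (k + 1) * 2^(k * l)."""
    return 2**(2**k) * (k + 1) * 2**(k * l)

def largest_class_lower_bound(k, n):
    """Pigeonhole: some class among the 2^n strings of length n is at least this large."""
    l = n // (2 * k)
    classes = class_count_bound(k, l)
    return -(-2**n // classes)  # ceiling division

print(class_count_bound(1, 1), coarse_bound(1, 1))
print(largest_class_lower_bound(1, 20))
```

Already for $k = 1$ and $n = 20$ the guaranteed class size is far larger than $\lfloor \log n \rfloor$, which is all the proof requires.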



So the $2^{n_i}$ strings of length $n_i$ are partitioned into at most $c \cdot 2^{n_i/2}$ equivalence
classes, whence for all sufficiently large $i$, there is an equivalence class of
cardinality at least $m_i := \lfloor \log n_i \rfloor$. For all such $i$, among all equivalence classes of
strings of length $n_i$ we choose one with maximal cardinality (breaking ties by
some easily computable but otherwise arbitrary rule), we let $J_i$ contain the first
$m_i$ strings in this equivalence class, and we let

$$\alpha_i := \frac{|A \cap J_i|}{|J_i|} . \qquad (7)$$

Claim 1. For almost all $i$,

$$\frac{1}{2} \cdot \frac{1}{2^{k+1}} < \alpha_i < \frac{3}{2} \cdot \frac{1}{2^{k+1}} . \qquad (8)$$



Proof. Fix an index $i$ and assume that (8) is false. Let $z_1 < \ldots < z_{m_i}$ be
the elements of $J_i$, let $D_j = g_0(z_j)$ for $j = 1, \ldots, m_i$, and let $D$ be the
union of $D_1$ through $D_{m_i}$. If we let $w = 1^{k+1}$ then by definition of $h_0$ we have
$\alpha_i = \alpha(R|D, w)$, whence (8) remains false with $\alpha_i$ replaced by $\alpha(R|D, w)$. On the
other hand, (8) is false with $\alpha_i$ replaced by $\alpha(\sigma, w)$ for at most a $1/2^i$-fraction
of all partial characteristic functions $\sigma$ with domain $D$ because by (4) and the
choice of the $m_i$, we have $m_i \geq m(k+1, 1/2^i)$. Remark 3 then shows that while
betting on $R$, a martingale can increase its capital by a factor of $2^i$ by betting for
all places in $D$ on the $1/2^i$-fraction of partial characteristic functions for which
(8) is false. Based on the latter fact we now construct a martingale, where we
leave it to the reader to show that the martingale can be computed in polynomial
time. The initial capital 1 is split into $c_1, c_2, \ldots$ where $c_i = 1/2^i$ is exclusively
used to place bets on the strings in the set $D$ which corresponds to the index $i$,
i.e., the strings which are in $g_0(x)$ for some $x$ in $J_i$. By the preceding discussion,
the martingale can increase the capital $c_i$ to at least 1 for all $i$ such that (8)
is false. But if this were the case for infinitely many values of $i$, the martingale
would succeed on $R$, thus contradicting the assumption that $R$ is p-random. □

While constructing the martingale $d$ which is meant to succeed on $B$, we will
exploit that by Claim 1 for almost all $i$, the density of the set $A$ on $J_i$ is confined
to a small interval around the comparatively low value $1/2^{k+1}$. Let $\Gamma$ be the
functional which corresponds to the btt($k$)-reduction given by $(g, h)$ - whence
for example $A$ is equal to $\Gamma(B)$ - and for all $i > i_0$, let

$$H_i = \bigcup_{x \in J_i} \{z : z \in g(x) \text{ and } |z| \geq l_i\} ,$$

i.e., $H_i$ is the set of all long queries made by strings in $J_i$. Then we can argue
that only for a fraction of all partial characteristic functions $\sigma$ with domain $H_i$
the set $\Gamma(\langle B, \sigma \rangle)$ has such low density on $J_i$. Formally, for every $i > i_0$ and for
every partial characteristic function $\sigma$ with domain $H_i$, we let

$$\beta_i(\sigma) := \frac{|\Gamma(\langle B, \sigma \rangle) \cap J_i|}{|J_i|} , \qquad \rho := \frac{3}{2} \cdot \frac{1}{2^{k+1}} ,$$

and, further,

$$\Theta_i = \{\sigma : \sigma \text{ a partial characteristic function with domain } H_i \text{ and } \beta_i(\sigma) < \rho\} .$$

By Claim 1, for almost all $i$, the restriction of $B$ to $H_i$ must be contained in $\Theta_i$.
We will argue next that there is some $\delta < 1$ such that for almost all $i$, the set $\Theta_i$
comprises at most a $\delta$-fraction of all partial characteristic functions with domain
$H_i$. We will then exploit the latter fact in the construction of the martingale $d$
by betting against the $(1 - \delta)$-fraction of partial characteristic functions outside
of $\Theta_i$ which have already been ruled out as possible restriction of $B$ to $H_i$.

For the moment, let $\tau_x$ be the Boolean function obtained from $h(x)$ by hard-wiring
$B(z)$ into $h(x)$ for all short queries $z$ in $g(x)$. Recall that by convention
queries are assigned to variables in length-lexicographical order, whence for
equivalent strings $x$ and $y$, the Boolean functions $\tau_x$ and $\tau_y$ are identical. Thus
for every $i$, all strings in $J_i$ are mapped to the same Boolean function, which we
denote by $\tau_i$. We call a Boolean function constant iff it evaluates to the same
truth value for all assignments to its arguments (whence in particular all 0-place
Boolean functions are constant).
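Hard-wiring and the notion of a constant Boolean function can be made concrete as follows. This is only a sketch: a truth-table is assumed to be a dictionary from bit tuples to truth values, variables are identified by their index, and the names `hardwire` and `is_constant` are hypothetical.

```python
from itertools import product

def hardwire(truth_table, fixed):
    """Restrict a truth-table by fixing the variables in `fixed` (index -> bit).

    The result is a truth-table over the remaining variables, kept in their
    original order.
    """
    arity = len(next(iter(truth_table)))
    free = [v for v in range(arity) if v not in fixed]
    restricted = {}
    for bits in product((0, 1), repeat=len(free)):
        full = [None] * arity
        for v, b in fixed.items():
            full[v] = b
        for v, b in zip(free, bits):
            full[v] = b
        restricted[bits] = truth_table[tuple(full)]
    return restricted

def is_constant(table):
    """True iff the function takes the same value on all assignments."""
    return len(set(table.values())) == 1

# Three-place conjunction as a toy truth-table.
and3 = {bits: int(all(bits)) for bits in product((0, 1), repeat=3)}
print(is_constant(hardwire(and3, {0: 1})))
print(is_constant(hardwire(and3, {0: 0})))
```

Fixing every variable yields a 0-place function with a single entry, which is constant in the sense used above.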

Claim 2. For almost all $i$, $\tau_i$ is not constant.

Proof. If $\tau_i$ is constant, then the value $A(x)$ must be the same for all $x$ in $J_i$.
But then $\alpha_i$ is either 0 or 1, whence Claim 1 implies that this is the case for at
most finitely many indices $i$. □

Claim 3. There is a constant $\delta < 1$ such that for almost all $i$, the set $\Theta_i$ comprises
at most a $\delta$-fraction of all partial characteristic functions with domain $H_i$.



Proof. For given $i$ such that $\tau_i$ is not constant, consider the random experiment
where we use independent tosses of a fair coin in order to choose the individual
bits of a random partial characteristic function $\sigma$ with domain $H_i$. Then all
partial characteristic functions of the latter type occur with the same probability,
whence the fraction we want to bound is just the probability of picking an
element in $\Theta_i$.

For every string $x$ in $J_i$, define a 0-1-valued random variable $b_x$ and, moreover,
define a random variable $\gamma_i$ with rational values in the closed interval $[0, 1]$ by

$$b_x(\sigma) := \Gamma(\langle B, \sigma \rangle, x) , \qquad \gamma_i(\sigma) := \frac{1}{|J_i|} \sum_{x \in J_i} b_x(\sigma) .$$

Consider an arbitrary string $x$ in $J_i$. By assumption, $\tau_i$ is not constant, whence
there is at least one assignment to $\sigma$ such that $b_x$ is 1. Moreover such an assignment
occurs with probability at least $1/2^k$ because $h(x)$, and thus also $\tau_i$, has at
most $k$ variables. Thus the expected value of $b_x$ is at least $1/2^k$ and by linearity
of expectation we obtain

$$E(\gamma_i) = \frac{1}{|J_i|} \sum_{x \in J_i} E(b_x) \geq \frac{1}{|J_i|} \sum_{x \in J_i} \frac{1}{2^k} = \frac{1}{2^k} . \qquad (9)$$

If we let $p$ be the probability of the event “$\gamma_i < \rho$”, we have

$$\frac{1}{2^k} \leq p \cdot \rho + (1 - p) \cdot 1 \leq \rho + (1 - p) = \frac{3}{2} \cdot \frac{1}{2^{k+1}} + (1 - p) , \qquad (10)$$

where the relations follow, from left to right, by (9), by definition of $p$ and by
$\gamma_i \leq 1$, because the probability $p$ is bounded by 1, and by definition of $\rho$. But
(10) is obviously false in case $(1 - p)$ is strictly less than $1/2^{k+2}$, whence $p$ can
be bounded from above by $\delta := 1 - 1/2^{k+2}$. □
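The bound of Claim 3 can be verified by brute force on a toy instance. The sketch below makes simplifying assumptions that are not part of the proof: there are no short queries, the domain $H_i$ is identified with the indices $0, \ldots, \texttt{domain\_size} - 1$, and every string in $J_i$ is given by its tuple of $k$ queried indices. The name `low_density_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def low_density_fraction(tau, queries, domain_size, k):
    """Fraction p of assignments sigma with gamma(sigma) < rho = (3/2) / 2^(k+1).

    `tau` maps a tuple of k bits to 0 or 1; `queries` holds one tuple of k
    indices into the domain for every string in J.
    """
    rho = Fraction(3, 2) / 2**(k + 1)
    low = 0
    total = 0
    for sigma in product((0, 1), repeat=domain_size):
        answers = sum(tau(tuple(sigma[z] for z in q)) for q in queries)
        gamma = Fraction(answers, len(queries))
        total += 1
        low += gamma < rho
    return Fraction(low, total)

# Toy instance: k = 2, tau is the conjunction, three strings with overlapping queries.
conjunction = lambda bits: int(all(bits))
p = low_density_fraction(conjunction, [(0, 1), (1, 2), (2, 3)], 4, 2)
delta = 1 - Fraction(1, 2**(2 + 2))
print(p, delta)
```

In this instance exactly half of the sixteen assignments have density below $\rho$, well within the bound $\delta = 15/16$ that the claim gives for $k = 2$.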



For all $i$, let $I_i = \{x : l_i \leq |x| < l_{i+1}\}$. The $n_i$ grow sufficiently fast such that for
some $i_1$ and for all $i > i_1$, the set $H_i$ is contained in $I_i$. Moreover, by Claim 3,
for some $i_2$ and all $i > i_2$, there is a set $\Theta_i$ of partial characteristic functions
with domain $H_i$ where, firstly, $\Theta_i$ contains only a $\delta$-fraction of all such partial
characteristic functions, and, secondly, $\Theta_i$ contains the restriction of $B$ to $H_i$.
Let $i_0$ be the maximum of $i_1$ and $i_2$.

Now we are in a position to describe a betting strategy which succeeds on $B$.
On input $w$, let $x$ be the $(|w|+1)$th string, i.e., the string on which we might bet.
We first compute the index $i$ such that $x$ is in $I_i$, together with the corresponding
set $H_i$. In case $i \leq i_0$ or if $x$ is not in $H_i$, we abstain from betting. Otherwise,
we place a bet on $x$ according to a betting strategy as described in Remark 3
which, while placing bets on the strings in $H_i$, increases the capital by a factor
of at least $1/\delta$ by betting against the partial characteristic functions which are
not in $\Theta_i$. Here all necessary computations can be performed in time $2^{O(n_i)}$ and
hence, by $|x| \geq l_i = \lfloor n_i/2k \rfloor$, in time $2^{O(|x|)}$. It follows that this betting strategy
induces a polynomial-time computable martingale which on interval $I_i$ preserves
its capital in case $i \leq i_0$ and increases its capital by a factor of at least $1/\delta$ for
all $i > i_0$. This finishes the proof of Theorem 4. □

Remark 5. The assertion of Theorem 4 remains valid if we simply require the
set $R$ to be $n$-random instead of p-random (i.e., if we require that there is no
martingale computable in time $O(n)$ which succeeds on $R$). For a proof, note that
Ambos-Spies, Terwijn, and Zheng have shown in [?] that for every $n^2$-random
set $R$, there is a p-random set $R_0$ which is p-m-reducible to $R$ while, in fact, the
latter assertion is true for $n$-random $R$. Now the relaxed version of Theorem 4
follows because the existence of a separating set $A$ as required in the theorem
extends directly from $R_0$ to $R$.

Remark 6. Theorem 4 states that the lower p-btt($k+1$)-span of every p-random
set $R$ contains a set $A$ which is not in the lower p-btt($k$)-span of any p-random
set. As already noted in [?], for a set $R$ which is not just p-random but is even
Martin-Löf-random, such a set $A$ cannot be recursive. This follows from the fact
that every recursive set which is p-btt($k+1$)-reducible to a Martin-Löf-random
set is in fact computable in polynomial time. The latter fact is attributed to
folklore in [?] and can be obtained as a special case of a result of Book, Lutz,
and Wagner in [?].



Corollary 7. For every p-random set $R$, the lower p-tt-span of $R$ is not contained
in the lower p-btt-span of p-RAND.

Proof. For a given p-random set $R$ and for every $k$, let $A_{k+1}$ be defined in the
same way as the set $A$ has been defined in (3) in the proof of Theorem 4, whence
$A_{k+1}$ is p-btt($k+1$)-reducible to $R$, but is not p-btt($k$)-reducible to any p-random
set. If we let

$$B = \{x : x = 1^k 0 y \text{ and } y \in A_k\}$$

then by the definition of the sets $A_k$ the set $B$ is p-tt-reducible to $R$. On the
other hand, if $B$ were p-btt-reducible to some p-random set $R_0$, then $B$ would
in fact be p-btt($k$)-reducible to $R_0$ for some $k$. But then, in particular, $A_{k+1}$ would be
p-btt($k$)-reducible to $R_0$, thus contradicting the choice of $A_{k+1}$. □
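The encoding used for the set $B$ in the proof of Corollary 7 is easy to decode: the number of leading ones selects the index $k$ and the remainder after the first zero is the string $y$. The sketch below assumes strings over the alphabet {0, 1} and a membership test `member(k, y)` for "$y$ in $A_k$" that is supplied by the caller; both function names and the test itself are hypothetical.

```python
def decode(x):
    """Split x = 1^k 0 y into the pair (k, y); return None if x contains no 0."""
    k = x.find("0")
    if k < 0:
        return None
    return k, x[k + 1:]

def in_join(x, member):
    """Decide whether x is in B, given a hypothetical test member(k, y) for A_k."""
    parts = decode(x)
    return parts is not None and member(*parts)

print(decode("1101"))
print(in_join("11010", lambda k, y: len(y) == k))
```

Since the index $k$ is read off from the input, the number of queries needed to decide membership in $B$ grows with the input, which is why the resulting reduction to $R$ is a p-tt-reduction without a fixed bound on the number of queries.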



5 Separations by Rec-random Oracles 



Lutz [?] showed that recursive martingales yield a reasonable measure concept
for the class of recursive sets, where in particular the class of all recursive sets
cannot be covered by a recursive martingale (see [24, 26, 27] for discussion of
measure concepts for the class of recursive sets).

Next we state two results on rec-random sets which correspond rather closely
to Theorem 4 and Corollary 7 on p-random sets. In connection with Theorem 8
and Corollary 9 recall from the introduction that rec-RAND is the class of sets
which cannot be covered by a recursive martingale. Moreover, let btt-reducibility
be defined like p-btt-reducibility, except that a btt-reduction is required to be
computed by a total Turing machine which might run in arbitrary time and
space, and let btt($k$)-reducibility be the restriction of btt-reducibility where the
number of queries is bounded by $k$.

Theorem 8. Let the set $R$ be in rec-RAND and let $k$ be a natural number.
Then the lower p-btt($k+1$)-span of $R$ is not contained in the lower btt($k$)-span
of rec-RAND.

Corollary 9. For every set $R$ in rec-RAND, the lower p-tt-span of $R$ is not
contained in the lower btt-span of rec-RAND.

Due to space considerations, we omit the proofs of Theorem 8 and Corollary 9
which are essentially the same as in the case of p-random sets.

Remark 10. Recall from Remark 1 that a global separation by a class $C$ extends
to all nonempty subclasses of $C$. Furthermore, note that Schnorr [?] has
implicitly shown that the class of Martin-Löf-random sets is a proper subclass
of rec-RAND. As a consequence, the global separation by the class rec-RAND
stated in Theorem 8 yields as a corollary the main result of Book, Lutz, and
Martin in [?] who showed that for all $k$, the lower p-btt($k+1$)-span of a
Martin-Löf-random set is never contained in the lower btt($k$)-span of the class of all
Martin-Löf-random sets.